This essay explains what various research terms mean, and provides examples for each.
A random variable is a set of possible numerical outcome or value from a random experiment that is selected by chance. The outcome is a function from a sample space that is mapped into real numbers. It can be classified into: continuous random variable, which takes an infinite number of possible outcomes, or discrete which is countable.
Continuous: Experimenter wish to estimate the height of NBA basketball players. Measuring the height of each of the player rather inefficient, hence experimenter choose 30 players randomly from a set of all NBA players (sample space). Each basketball player in NBA has an equal chance of being selected. Thus each player selected for the sample is said to be a random variable. And height measurement can be anywhere between zero to infinity, hence continuous.
Discrete: number of heads when tossing a coin twice. Possible outcome or the sample space are HH, HT,TH,TT. Translated into the real line the possible number of heads, are 0,1 and 2.
Random sample is a subset of units or subjects from a population, where each unit in the population has an equal chance of being selected.
In a company consists of 300 employees (population), 30 people are selected randomly. The 30 people selected are called a random sample because they are selected by chance, each employee has an equal chance of being selected.
Mgf stands for moment generation function of x is denoted by M_x (t), exists if there is a positive number h such that the above summation exists and is finite for ‘h< t < h. By definition we can write mgf of X as:
Let X be a random variable with probability mass function f(x) and an element of S. Then:
M_x (t)=E(e^tx )=’_(x’S)”e^tx P(X=x)’ , if x is discrete
M_x (t)=E(e^tx )=’_(-‘)^”’e^tx f(x) dx’ , if x is continuous
Also, by theorem if X has mgf M_x (t) then
EX^n=’M_x’^((n) ) (0),
‘M_x’^((n) ) (0)=d^n/’dt’^n M_x (t)|_0
Is the nth moment which can be found by deriving M_x(t) nth times and evaluating it at t=0.
Let x be distributed exponentially, hence the probably distribution function of x is
f(x)=1/?? e^((-x)/??) 0<x<‘ , ??>0
Obtaining the mgf we take:
E[e^tx ]=’_0^”'(e^tx 1/?? e^((-x)/??) dx@ )
Is the moment generating function of exponential distribution.
Observational study is a study where the investigators observe subjects and measure variables of interest without applying any treatments to the subjects.
A research study comparing the risk of developing heart disease between people who exercise regularly (at least three times a week for an hour each session) and people who do not exercise regularly. Experimenter only observe without altering any variable in the experiment or the environment.
Experimental study is a study where the investigators apply treatments to the experimental units and then measure the effect of the treatments on the variable of interest.
A research study comparing weight loss on two groups of people. The first group is assigned to take a dietary supplement and the second group is not assigned to take any supplement. Weight loss is the variable of interest and different dietary supplement formulas are the treatments applied.
Cross-sectional study is a type of observational study that is conducted without manipulating the environment, in which data is collected from different groups of observation at one specific point in time.
A study to measure blood sugar levels of people who drink alcohol regularly and people who don’t drink alcohol regularly, across two age groups, over 35 and under 35. Blood sugar levels are measured at a specific point in time, without considering its past and future levels and without altering the study environment.
Retrospective study is a study that looks backward in time, in which the data is collected from historically or past records.
A study about development of an illness or disorder such as Post Traumatic Stress Disorder (PTSD). In the study, investigators compare two groups of adults who suffer and do not suffer from PTSD. Investigator might be interested in factors such as if the subjects have had any traumatic experience, how severe the experience was and when the timing of the experience and analyze if there is a correlation between those factors (or combination of those factors) with PTSD.
A study about development of an illness or disorder such as BPD (Borderline personality disorder). In the study, investigators compare two groups of adults who have and do not have ADHD. Investigator identifies if they have experienced any traumatic childhood experience and analyze the relationship between ADHD and traumatic childhood experience.
Prospective study is a research method that watches for the outcomes of the study objective during the study period and learn the factors or events observed during the study. Investigators record events that happen during the study and then evaluate the results at the end of the study period. It is often used to observe a development of a disease in which subjects are being studied and watched over the study period. The outcome of interest should be common in the group and should be able to be distinguished from outcomes that may have arisen by chance.
A study of children with alcoholic parents in which the children are observed during the study period, events and factors such as geographic location, religious involvement, academic are observed and examined later. During the study period investigators watch for an outcome, that is if the children will become alcoholic too. The objective is to examine children of alcoholic parents and risk of becoming alcoholic.
Element is every distinct object that is a contained in a set.
In a set of odd numbers, any odd number such as 1, 3 and 9 is an element of the set.
In a set of possible outcome of a coin toss, head and tail are elements of the set.
A set of alphabetical letters contain 26 different alphabets or elements which are the letters a-z.
Confounding is when the effect of a factor cannot be distinguished from the effect of another factor. When this happens the two factors are said to be confounded.
In a study of coffee drinking and lung cancer, people who drink coffee are more like to smoke cigarette. The results may seem to show that coffee drinking increases the risk of lung cancer, even though this cause of it could be the smoking and not drinking coffee. Here the effect of drinking coffee and smoking cigarettes cannot be distinguished and said to be confounded.
Blinding in design of experiment occurs when the experimenter does not know which treatment is assigned to which unit. Double-blind occurs in design of experiment occurs when both the experimenter and the experimenter do not know which treatment was assigned to which participant.
Blinding helps to avoid biasness in the experiment.
In a drug test experiment blinding occurs when the investigator does not know which formula is assigned to which participant, avoiding biasness when evaluating the result. Double blinding occurs when the participant also does not know which formula was assigned to her/him, resulting in a more objective result because it eliminates biasness that may result from placebo effect.
Blocks or blocking is a technique in design of experiment in which experimental units are assigned into groups (blocks) to improve the precision of the experiment when a known and controllable nuisance factor (block) exists in the experiment.
In a tire brand experiment, the objective is to measure tire wear of five different tire brands. Each brand is installed in five different cars that are driven by five different drivers. Here, the cars and drivers are called blocks because and can be assigned in a way that allows the experimenter to justify variability due to the nuisance factor (blocks).
Completely randomized design is a method in analysis of variance to analyze the effect of different treatment levels on response variable. In the experiment a number of treatments are being investigated, each treatment level is replicated n times and treatments are assigned to the experimental units in a completely random manner
In a tire brand experiment, five different tire brands are being investigated without any blocks (known nuisance factor) in the design. Each treatment is replicated, for example, three times, thus fifteen experimental unit are needed. Treatments are assigned randomly to the units by obtaining a permutation of fifteen numbers.
Completely randomized block design is an experimental design in which experimental units are assigned into blocks (known and controllable nuisance factor) of the same characteristics. The treatments are randomized within each block and each treatment is contained once in every block (complete). By using this design, investigator is able to separate source of variation due to the block and improve the precision of the experiment.
In the tire brand experiment, the objective is to measure tire wear of five different tire brands. The tires are tested using five different cars thus each car is considered a block. Each tire brand is tested with each of the car once, in other words, each treatment (tire brand) is contained in every block once. Using this design, investigator will be able to set aside variability due to cars and focus on the factor of interest, tire brand.
Predictor variable is an independent variable in the experiment that affect the outcome of the experiment, often called independent variable.
When projecting the price of houses in a certain area, several variables that might be good predictors of the response are square footage, number of bedrooms, number of bedrooms and access to school.
Response variable is a variable of interest, an outcome or dependent variable in an experiment that is affected by the predictor or independent variable.
The number of houses sold in a year is the response that is affected by predictor variables such as mortgage interest rate and house price.
Treatment is a combination of independent variables, or factor levels in an experiment that is applied to the experimental unit.
In a drug test experiment, investigator is interested in testing the effectivity of several different drug formulas to the experimental unit. Different drug formulas are said to be the treatment applied to the experimental unit.
Subject is an experimental unit to whom the study or treatment is being applied.
In a drug test experiment, different formulas (treatments) are tested on humans, which are the subject of the experiment.
Control is a subject group of subjects that receives no or neutral treatment in the experiment. The investigator uses control group to compare the result of the group that receives treatment to the result of the control group (normal state).
In a drug test experiment a group of experimental does not take any drug formulas or might take a placebo pill that does not have any effect, while the others receive treatments such as taking different drug formulas. Thus experimenter will be able to compare the effectivity of different drug formulas to the neutral, non-treated state.
Randomization is a statistical method of selecting experimental units in a random manner such that each element has an equal chance of being selected. The method is used to reduce bias in the experiment and spread the effect of unknown and uncontrollable nuisance factor.
In a complete randomized design experiment, the experimenter wish to investigate the effect of three different weight gain supplement formula. Each treatment is repeated five times, hence fifteen pigs are needed for the experiment. To reduce bias, such as the first five pigs might be from the same family who are genetically more likely to gain weight, randomization is performed. This is done by obtaining a random permutation of 1 to 15, and assign the first, second and third five pigs to the first, second and third treatment group.