Study population and sampling
The study was conducted over six months, from August 2008 to January 2009, with the majority of fieldwork done between August and October of 2008. The area sampled covers two divisions within the Iringa District of Tanzania bordering the Ruaha National Park and Wildlife Management Areas: Pawaga and Idodi Divisions. The Pawaga Division is composed primarily of agro-pastoralist communities practicing subsistence living and traditional animal husbandry. The Idodi Division is composed of similar subsistence-based agricultural villages, but in general supports a more transient agro-pastoral and pastoralist community. The Ruaha Landscape is located in the Southern Highlands of Tanzania, an area characterized by variable rainfall and prolonged periods of drought.
Study site roughly between Ruaha National Park and Iringa…
Households recruited for the study consisted of agro-pastoral and pastoralist households from the Barabaig, Maasai, and Sukuma tribes participating in a larger study on diseases that can be passed from animals to humans, as part of the Health for Animals and Livelihood Improvement (HALI) project, of which this study is a small component. HALI selected study households in order to obtain an accurate representation of households in the larger region. Criteria used in their selection process include ethnicity, socio-economic status, and geographic location. Potential candidate households were then contacted to obtain consent, meaning the household members agree to participate in the study, and were informed that livestock would be sampled, and a series of interviews would take place primarily inquiring about their health, livestock and livestock health, their income and socio-economic activity and status.
In a scientific study, it is considered ideal to have a sample that is fairly uniform. That is, a sample that shares certain traits and attributes in common, so that you can compare each household in your study to another regarding the research question, while minimizing the factors that may influence differences in results that are not controlled for in your study. Loosely translated, this means that if I were studying the effects of eating cheeseburgers vs. beets on cholesterol, I would want a sample that was very similar: similar age, similar income level, similar ethnicity, etc. A perfect sample would be Lego Men. They’re all the same height, weight, color, and typically all have the same job and therefore income. I could use them as a case-control group, and have 20 eat cheeseburgers for a month, while 20 others ate no cheeseburgers but only beets. The effects would thereafter be measured, and we could be reasonably sure that all noticeable and measurable differences were due to diet. A perfectly terrible example would be GI Joe men. They’re all over the board. Just read their “classified cards” on the back of each package. Let’s compare two: Duke is a highly educated, Caucasian male who runs the entire show. BBQ wears a mask, is a fireman (maybe even from another planet), speaks a very strange dialect along with perfect English, is maybe mentally ill, and probably follows a different nutritional regimen than Duke. Any results of the cheeseburger/beet experiment on these two would not be so easily discernible, as BBQ may have variable cholesterol due to a host of factors ranging from genetics to on-the-job stress. No Joe!
My study had to work with the HALI sample of 160 households, and select a sub-sample of 60. I’ll get to why we chose 60 in a little bit, but first, let’s look at how I chose my households. I needed to mimic the HALI sample and include a representative proportion of ethnicity, income, and livelihood, so my study needed Barabaig, Maasai, and Sukuma as well as pastoralists and agro-pastoralists living in both Pawaga and Idodi Divisions. Tricky enough, but because my study wanted to look at diarrheal disease in calves, I also wanted to choose households that reported to HALI that diarrhea was a problem in their herds. So I mined the HALI data, and created a list of households that reported diarrhea. Then, in order to not be biased, I used a computer program called a random number generator (Random.org) to select 80 households from within the HALI sampling frame. I needed 80 in case some households did not want to participate in the study, and so I had some alternate households to choose from. We then selected households from this list, and because of conditions in the field, had to be pretty flexible in households we chose. Sometimes, just like us, pastoralists are not at home. But unlike us, pastoralists live far, far away, and driving there requires time, gas, patience, all of which equal money. I didn’t have very much money, so we really had to work with households from the list that were around on the days we were in the field.
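For the curious, here is a rough sketch of what a random draw like that looks like in code. I used Random.org, not Python, so treat this as an illustration only: the household IDs are made up, the function name `draw_households` is hypothetical, and splitting the draw into 60 first-choice households and 20 alternates is my assumption about how you might organize it (in the field we were more flexible than that).

```python
import random

def draw_households(candidate_ids, n_draw=80, n_primary=60, seed=None):
    """Randomly draw n_draw households without replacement, then split
    the draw into first-choice households and alternates.
    All names and defaults here are illustrative assumptions."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    drawn = rng.sample(list(candidate_ids), n_draw)
    return drawn[:n_primary], drawn[n_primary:]

# Made-up household IDs standing in for the real list
candidates = [f"HH{i:03d}" for i in range(1, 161)]
primary, alternates = draw_households(candidates, seed=2008)
print(len(primary), "first-choice households,", len(alternates), "alternates")
```

The point of letting the computer choose is simply that nobody's preferences (mine included) can sneak into the list.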
Sample size estimates
Within each household there is a livestock herd. Within each livestock herd there is a ndama or calf herd. So far, no one has studied diarrheal disease in these calf herds, but others have studied disease in calf herds in the Iringa area near villages (not pastoralist herds). They found prevalence rates for Cryptosporidium of around 20% there. That means 1 in 5 animals was infected with Cryptosporidium. Since those studies were interested in infection rates at the animal level, and we were interested in rates of infection at the herd level, we had to choose our sample sizes a little differently.
Scientists use math to obtain a sample size. If I were looking at infection at the animal level like the other study in Iringa, I would take their 20% prevalence rate as what I would expect to find, and use it to obtain a sample size large enough to detect a prevalence of about 20% in my study as well. There is an equation you can use for this. There are also tables printed in statistics books, based on a series of equations run by others, that you can reference for your sample size.
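To give a flavor of that equation, here is a minimal sketch using one common textbook formula for estimating a prevalence, n = z² × p × (1 − p) / d². I'm not claiming this is the exact equation behind my numbers; the margin of error `d` (5 percentage points) and the 95% confidence level (z = 1.96) are assumptions I picked for the example, and the function name is hypothetical.

```python
import math

def sample_size_for_prevalence(expected_prevalence, margin_of_error, z=1.96):
    """Textbook sample size for estimating a proportion:
    n = z^2 * p * (1 - p) / d^2, rounded up to a whole animal.
    z = 1.96 corresponds to 95% confidence (an assumed default)."""
    p = expected_prevalence
    n = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    return math.ceil(n)

# Expected prevalence of 20%, assumed margin of error of +/- 5 points
print(sample_size_for_prevalence(0.20, 0.05))  # 246
```

Notice how sensitive the answer is to the margin of error: loosen it to 10 points and the same formula asks for only 62 animals.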
Most scientists use computer programs to run their sampling equations. We decided on a sample size of 300 animals across our 60 households. This allowed us to sample just enough animals to be able to pick up a herd infection rate of 20%, meaning 1 in 5 herds belonging to the households was infected. The herd level was more important to us because we were interested in Cryptosporidium and Giardia infections that may be dangerous to people who closely interact with their animals. Having just one infected animal nearby is enough to pose a threat to the health of people who watch and interact with that animal.
Another question in sample size was how many animals to sample at each household. Since some households have small calf herds and others large calf herds, we had to vary the sample size at each household. In small herds we sampled every animal, and in larger herds we tried to sample at least 50% of the animals. In essence, this broke down to about 5 animals per household, as many herds were similar sizes. In addition to the number of animals, you need to have a protocol for the selection of animals. Typically we would randomly select an animal using a random number generator, and then target every other animal in the pen. This makes sure that we aren’t unconsciously selecting animals that look ill, or that are easy to catch, and that would bias our samples towards a certain type of animal (here we didn’t want Lego Men, but GI Joe, confusing I know…). Furthermore, we had to make sure that animals we sampled met our criteria: they had to be calves (generally under 1 year of age in pastoralist herds).
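If you like seeing rules written down as code, the selection protocol above can be sketched like this. It's a sketch under assumptions, not our actual field procedure: the cutoff for a "small" herd (`small_herd_max=5`) is a number I made up for the example, and `pick_calves` is a hypothetical helper.

```python
import random

def pick_calves(calf_ids, small_herd_max=5, seed=None):
    """Small herds: sample every calf. Larger herds: pick a random
    starting calf, then take every other calf in pen order.
    small_herd_max is an assumed cutoff, not a value from the study."""
    calves = list(calf_ids)
    if len(calves) <= small_herd_max:
        return calves
    rng = random.Random(seed)
    start = rng.randrange(len(calves))         # random starting animal
    in_order = calves[start:] + calves[:start]  # walk the pen from there
    return in_order[::2]                        # every other animal

pen = [f"calf{i}" for i in range(1, 11)]  # a made-up herd of 10 calves
print(pick_calves(pen, seed=1))
```

Taking every other animal from a random start always gives you at least half the herd, which lines up with the "at least 50%" target for larger herds.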
Sampling Procedure (Livestock)
Yeah, we had to put on gloves and take poop out of calves’ buttholes. Then we put the poop in little bags and mixed it with a buffered formalin solution to preserve the samples like poop mummies.
Sampling Procedure (Household)
No poop here, well, a lot less anyway. Instead, we gave a 35-minute interview consisting of about 35 questions to members of the households. The questions were about livestock management, livestock diseases, dangerous diseases and health of calves, and other questions about disease and health, and general livestock husbandry practices. The interviews were done in Swahili, and since there is a high illiteracy rate, a field assistant read the questions to the household members and recorded their answers. Then I measured things like the size of the livestock pens and stables, and inventoried the items in and around the property, like water availability, forage, crops, belongings like bicycles and stuff, and how much environmental contamination (poop) was littered about the ground near the household.
This study followed a cross-sectional design. Cross-sectional designs are different from the case-control Lego Man design described above, because they are mainly suited to describing conditions at a certain time in a certain population. Kind of like finding out the number of Lego Men with swords at castles in January. What is the sword prevalence of the Lego Men at a distinct time and why? Do Viking castles have a higher prevalence of swords? What about Ninja castles? That’s basically what we did. How many calf herds had Crypto or Giardia at the time of interview/sampling (read: August to January) in pastoralist herds in Iringa and why? Do Maasai herds have higher prevalence? What about herds closer to the wildlife management area? What about larger herds? You get the picture… The interview helped us record some things we could use to try and find out what factors were related to infection so we could try and deduce what could be dangerous for the young livestock, and therefore help the households keep their animals healthy so they can stay healthy.
Sword Prevalence at about 42% here…
Case definitions are what scientists use to designate the intended outcome variable. What is that? Well, in our case for example, the outcome variable is a calf infected with Crypto or Giardia. Our case definition is a calf infected with Crypto or Giardia, where infection is confirmed by laboratory analysis of its poop.
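Here's a little sketch of how a case definition turns into a prevalence number, and why the animal level and the herd level can give different answers. The data are invented, and counting a herd as positive when at least one of its calves is a lab-confirmed case is my assumption for the example.

```python
def animal_prevalence(calf_results):
    """Share of sampled calves that meet the case definition."""
    positives = sum(1 for _, positive in calf_results if positive)
    return positives / len(calf_results)

def herd_prevalence(calf_results):
    """Share of herds with at least one positive calf
    (an assumed herd-level definition for this example)."""
    herd_status = {}
    for household, positive in calf_results:
        herd_status[household] = herd_status.get(household, False) or positive
    return sum(herd_status.values()) / len(herd_status)

# Invented lab results: (household, calf is a confirmed case?)
results = [
    ("HH001", False), ("HH001", True),
    ("HH002", False), ("HH002", False),
    ("HH003", False), ("HH004", False), ("HH005", False),
]
print(animal_prevalence(results))  # 1 of 7 calves
print(herd_prevalence(results))    # 1 of 5 herds
```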
Laboratory analysis was done by Enos, a nice veterinary student at the Sokoine University of Agriculture in Tanzania. Basically, Enos took my poop samples and prepared microscope slides using a kit purchased from Waterborne Inc. The kit is called the A100FLK AquaGlo Giardia/Cryptosporidium Direct Comprehensive Kit. In the preparation of the microscope slides, Enos applied a fluorescein monoclonal antibody reagent to the poop solution. This reagent is specially engineered to bind to Giardia and Crypto if they are in the poop. Enos then would take the microscope slide and look at it under a fluorescence microscope, very very carefully. I have a picture he sent me of a positive sample in some poop. Check it out…
It’s not the greatest picture, but it shows how difficult it is to see Crypto in poop. Poop is messy. We all know this. It’s not a surprise. So you can see what Crypto does look like under the microscope, here’s another picture from the EPA website which isolates the oocysts, kind of a control slide…
That’s what Cryptosporidium oocysts look like in the microscope. After finding some of these little blobs, Enos then starts to count them. He counts them all over a certain area of the slide in order to generate a number of oocysts in the sample so we can determine just how infected the calf actually is. Shedding oocysts sucks, and if a calf is really infected it will shed quite a few, especially during the height of the infection before its body starts to adapt to the pathogen. From Enos’ calculations, we can do some calculations of our own, and find out how many oocysts per gram of poop the calf is spewing into the environment. The more the merrier? Not at all. More oocysts per gram of feces means more environmental contamination from the pathogen, and a riskier environment for the people living and interacting with the animal.
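The back-calculation from a slide count to oocysts per gram goes roughly like this. I'm sketching the general idea only: the actual volumes and any correction factors depend on the kit protocol and how the sample was prepared, so every number and parameter name below is a made-up placeholder.

```python
def oocysts_per_gram(oocysts_counted, volume_examined_ml,
                     suspension_volume_ml, feces_grams):
    """Scale a slide count up to the whole sample, then divide by the
    weight of feces. All inputs are illustrative assumptions; a real
    protocol may add dilution or recovery corrections."""
    if volume_examined_ml <= 0 or feces_grams <= 0:
        raise ValueError("volume examined and feces weight must be positive")
    per_ml = oocysts_counted / volume_examined_ml  # concentration on the slide
    in_sample = per_ml * suspension_volume_ml      # whole preserved sample
    return in_sample / feces_grams

# Made-up example: 12 oocysts seen in 0.01 mL taken from a 10 mL
# suspension that contains 2 g of feces
print(round(oocysts_per_gram(12, 0.01, 10, 2)))
```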
Once I have all of the laboratory information, then I start to analyze the data. I do some statistical analysis. I use my case definition and my outcome variable, and I start to look at how different things are associated with it. I use the questions and answers from my survey, questions like how many calves were born in the last year, and what is your calves’ water source, and I run what’s called a bivariate analysis. This means I look at how closely one thing, like water source, and infection of a calf are associated statistically. I do this in a computer program. The computer program will give me some output and some graphs, and then I select the things that are most closely associated with infection, and some other things I think are logically associated with infection, and I make a statistical model. It’s like cooking. I add ingredients (things associated with infection) and watch it cook. Then I taste it and see if it’s OK. If it’s too salty I take out some ingredients and try it again. If it’s too bland, I add some ingredients to it. I do this until me and my computer program are happy with the dish. Then we look at what the recipe is, and share it with the scientific community. The ingredients in the recipe are now called “risk factors.” The recipe is called “Prevalence of Cryptosporidium and associated risk factors in neonatal livestock in traditional pastoral livestock systems.” That’s the bee’s knees.
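To make "bivariate analysis" a bit less mysterious, here is a small sketch of one common way to do it for a yes/no exposure and a yes/no infection: a 2×2 table with an odds ratio and a chi-square test. I haven't said which test my program runs, so take this as one plausible version rather than my method; the counts are invented and the function name is hypothetical.

```python
import math

def two_by_two(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg):
    """Odds ratio, chi-square statistic (1 degree of freedom, no
    continuity correction) and p-value for a 2x2 table."""
    a, b, c, d = exposed_pos, exposed_neg, unexposed_pos, unexposed_neg
    if b * c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Invented counts: calves using one water source (12 infected, 18 not)
# versus calves using another (4 infected, 26 not)
odds_ratio, chi2, p_value = two_by_two(12, 18, 4, 26)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

With counts this small, an exact test is often the safer choice; the chi-square version is just the easiest one to show by hand.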
I’ll post more about the statistical analysis and results next time. It’ll take me a while. I started the bivariate analysis this week, getting familiar with all the ingredients. As I still don’t have all the lab results, I have a while to wait before we get really heavy with the data. When I do, I’ll write all about it, I promise….