In 2018, PlanEval contributed to the impact assessment of a project supporting smallholder producers of cacao, pepper and coffee in São Tomé and Príncipe, a two-island state in the Gulf of Guinea. It is Plan Eval’s second project in the African small-island state, and a first for Pauline Mauclet, researcher at Plan since 2017 and Field Coordinator for the project. In this blog post, she reflects on some of the main challenges of collecting data for an impact evaluation and illustrates them with the project in São Tomé and Príncipe.
About the project
São Tomé and Príncipe is a small-island state located in the Gulf of Guinea, about 250 kilometers off the West African coast. The former Portuguese colony has a long history of producing cacao. Under colonial rule, the Portuguese built large plantation estates, commonly called roças, for the production of cane sugar and later cacao. Although production has had its ups and downs both during and after colonization, agriculture has always been the driver of the country’s economy. When organic chocolate started to become a valuable good in the early 2000s, the International Fund for Agricultural Development (IFAD) saw an opportunity to boost the country’s economy, which had suffered from the drop in (non-organic) cacao prices in the 1990s. That is how the PAPAFPA project, or Participatory Smallholder Agriculture and Artisanal Fisheries Development Programme, came into being. It was followed and complemented by the PAPAC project (2004 – 2014). The interventions supported the development of an export cooperative (PAPAFPA) and facilitated the production of certified organic family plantations (PAPAC). They aimed to increase agricultural productivity, strengthen the existing producers’ associations and enhance access to markets through the provision of training, small infrastructure, and financial and managerial support to the cooperatives. After an initial success with the cacao value chain, the programme expanded its activities and applied the same concept to the production of organic pepper and coffee, and a second cacao cooperative was created.
Working in the field is rarely an easy task. Field experiments do not enjoy the same level of control as lab experiments, where every observed factor can be carefully controlled and treatment allocation is (usually) randomized. When designing the most appropriate methodology for an evaluation, researchers need to account for uncontrollable and sometimes unobservable factors. In project or programme evaluations, for example, the design of the programme itself can prevent randomization, introducing selection bias that needs to be accounted for.
For the evaluation of the PAPAFPA and PAPAC programmes, the research team was faced with the following challenges:
1. Absence of baseline data
The absence of baseline data is a relatively common issue in project and programme evaluations, although more and more organizations now budget for both a baseline and an endline evaluation of their programmes. Although suboptimal, the issue can be partially addressed through retrospective questions that rely on the respondent’s recollection. For this evaluation, the research team designed a questionnaire with questions about the producer’s situation at the end of the project (that is, at the time of data collection) and about the situation before the programme was implemented. Note that this exercise is usually easier for beneficiaries than for nonbeneficiaries, since the time marker (before/after participation) is much clearer for participants than for non-participants, who have to recall their situation in a given year (e.g. what was the situation in, say, 2005?).
2. Programme eligibility criteria and the resulting selection bias
The immediate beneficiaries of both programmes were the cooperatives that had been created under the PAPAFPA programme. The producers who sold their production to the cooperatives were indirect beneficiaries of the programme. They were selected by the cooperatives through their producers’ organizations, based on a set of entry requirements. This implies that the selection of candidates wasn’t randomized but based on observed characteristics, resulting in a non-random treatment allocation and a potential selection bias among the group of beneficiaries. If the selection criteria are correlated with the outcome (which they usually are), one needs a methodology that accounts for the selection bias and makes sure that treatment and control groups are comparable. Otherwise, there is no guarantee that the estimated difference in outcome between treatment and control is due to the treatment (and only the treatment), as it could also be explained by the difference in initial selection criteria.
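To make this concern concrete, here is a minimal simulated sketch in Python. It is not based on the project’s data: the baseline productivity variable, the entry threshold of 0.5 and the "true" effect of 2.0 are all assumptions chosen for illustration. Producers are admitted only if their baseline productivity is high, and the outcome also depends on that baseline, so a naive comparison of means mixes the programme effect with the pre-existing difference.

```python
import random
from statistics import mean

random.seed(0)
TRUE_EFFECT = 2.0  # hypothetical programme effect, chosen for illustration

def simulate(n=2000):
    """Simulate producers whose admission depends on baseline productivity."""
    rows = []
    for _ in range(n):
        baseline = random.random()   # hypothetical baseline productivity in [0, 1)
        treated = baseline > 0.5     # entry requirement: only stronger producers join
        # The outcome depends on the baseline AND on the treatment.
        outcome = 10 * baseline + TRUE_EFFECT * treated + random.gauss(0, 1)
        rows.append((treated, outcome))
    return rows

rows = simulate()
naive = mean(y for t, y in rows if t) - mean(y for t, y in rows if not t)

print(f"true effect:      {TRUE_EFFECT:.2f}")
print(f"naive difference: {naive:.2f}")
```

In this toy setting the naive difference comes out well above the assumed effect, because the treated producers were already more productive before the programme started.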
Accepting the hypothesis that selection bias is mostly based on observed characteristics, the most commonly used techniques to account for bias are matching methods. Matching methods, as their name suggests, couple beneficiary and nonbeneficiary units (such as families, households and plots) based on observed characteristics. The technique does not account for any unobserved factors affecting participation.
The IFAD research team chose to use propensity score matching, which compares only treatment and control households that are matched on baseline and target variables. In practice, these included variables linked to the probability of inclusion, along with the level of certain outcome variables (e.g. level of production; level of income) at baseline (pre-project).
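For readers who have never seen the technique in action, the sketch below shows the general idea on simulated data. It is an illustration under stated assumptions, not the IFAD team’s actual specification: there is a single hypothetical covariate (baseline production), the participation model is a simple logistic regression fitted by gradient descent, and each treated household is paired with its nearest control on the estimated score, with replacement. All names and parameter values are made up for the example.

```python
import math
import random
from statistics import mean

random.seed(1)
TRUE_EFFECT = 2.0  # hypothetical effect used to generate the toy data

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Simulated households: (baseline production, treated?, outcome).
households = []
for _ in range(1000):
    x = random.random()
    treated = random.random() < sigmoid(3 * (x - 0.5))  # participation depends on x
    y = 5 * x + TRUE_EFFECT * treated + random.gauss(0, 0.5)
    households.append((x, treated, y))

def fit_logit(data, steps=1000, lr=1.0):
    """Fit P(treated | x) = sigmoid(b0 + b1 * x) by plain gradient descent."""
    b0 = b1 = 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, t, _y in data:
            err = sigmoid(b0 + b1 * x) - t
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Step 1: estimate each household's propensity score.
b0, b1 = fit_logit(households)
scored = [(sigmoid(b0 + b1 * x), t, y) for x, t, y in households]
treated = [(p, y) for p, t, y in scored if t]
controls = [(p, y) for p, t, y in scored if not t]

# Step 2: nearest-neighbour matching on the score, with replacement
# (the same control can serve as the match for several treated households).
def nearest_control(p):
    return min(controls, key=lambda c: abs(c[0] - p))

# Step 3: average outcome difference over matched pairs (effect on the treated).
att = mean(y - nearest_control(p)[1] for p, y in treated)
naive = mean(y for _, y in treated) - mean(y for _, y in controls)

print(f"naive difference: {naive:.2f}")
print(f"matched estimate: {att:.2f}")
```

In this simulation the matched estimate lands much closer to the assumed effect than the naive difference does. In a real evaluation the score would be estimated from many covariates, and the matching would only correct for the characteristics that were actually observed.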
3. Incomplete list of beneficiaries and multiple treatments
In order to match beneficiary producers with nonbeneficiary ones, the research team needed a complete list of project participants, as well as the treatment received by each. The beneficiary group included producers from the three value chains (cacao, coffee and pepper) who had benefited from both the PAPAFPA and PAPAC programmes, as well as producers who only benefited from the PAPAC programme.
However, not all of the four cooperatives were able to provide a full list of beneficiary producers indicating the extent of support received and the producers’ organizations to which they belonged.
4. Finding a sufficiently large and representative control group (and the risk of contamination)
In addition to complete data about the beneficiary group, the research team needed a list of communities with characteristics similar to those of the communities exposed to the projects, in order to identify producers whose profile at baseline was comparable to that of the producers who received the treatment. However, at the start of the evaluation such a list didn’t exist, and there wasn’t any national farmers’ registry that could serve as a basis for identifying potential candidates for the control group.
An additional challenge was related to the programmes’ geographical coverage. Overall, 108 communities benefited from the two projects. Given the nature of the programmes and their interventions, there is a real possibility of spillover effects from beneficiary communities to neighbouring nonbeneficiary communities. The islands of São Tomé and Príncipe are small and people know each other well. Most people have family living in a nearby community, and our experience in the field showed that people move around a lot from one community to another. These neighbouring communities therefore cannot be included in the control group, since they might have benefited indirectly from the programmes.
Given these initial challenges, the IFAD team conducted preliminary field visits to consolidate the cooperatives’ lists of beneficiaries and to identify, together with specialists from the Project Implementation Team and the cooperatives’ leaders, the communities that were eligible to enter the control group. From this initial exercise, only 36 “pure” eligible control communities were identified. For the propensity score matching to ensure common support, a sufficient number of nonbeneficiary farmers from these 36 communities had to be matched with beneficiary farmers from each of the three value chains, so there was a real risk of not reaching a sufficiently large control group. However, this could only be confirmed or refuted after an initial enumeration exercise among treatment and control communities.
As part of the impact assessment and prior to the quantitative data collection, PlanEval’s data collection team conducted a detailed enumeration exercise to obtain a listing of all households living in the treatment and control communities (over 5,000 households). Based on this listing, the IFAD team obtained an inventory of all producer households from both the treatment and control communities. The listing also collected basic information on the profile of each producer household, which was then used to perform the matching exercise for each of the three value chains.
The matching exercise was successful and a final sample was set up, composed of 1,687 households (799 treated; 800 untreated). In order to guarantee enough common support (max. 6% attrition bias), the data collection team was asked to collect data from a minimum of 700 treated and 800 control observations. Note that some control observations were used for more than one treated observation.
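As an aside, one simple and commonly used way to check common support is the min-max rule: keep only the observations whose propensity score falls inside the range covered by both groups. The sketch below illustrates that rule on made-up scores. It is an assumption on my part that a rule of this kind applies here; the text above does not specify which criterion the IFAD team used, and the numbers are purely illustrative.

```python
def common_support(treated_scores, control_scores):
    """Return the score range covered by both groups (min-max rule)."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high

def on_support(scores, low, high):
    """Keep only the scores that fall inside the overlap region."""
    return [p for p in scores if low <= p <= high]

# Made-up propensity scores, for illustration only.
treated_scores = [0.30, 0.45, 0.60, 0.75, 0.95]
control_scores = [0.10, 0.25, 0.40, 0.55, 0.80]

low, high = common_support(treated_scores, control_scores)
kept_treated = on_support(treated_scores, low, high)
kept_controls = on_support(control_scores, low, high)
share_dropped = 1 - len(kept_treated) / len(treated_scores)

print(f"overlap region: [{low}, {high}]")
print(f"treated kept: {len(kept_treated)}, controls kept: {len(kept_controls)}")
print(f"share of treated off support: {share_dropped:.0%}")
```

Treated households with no comparable control nearby fall outside the overlap region and cannot be matched credibly, which is why a large enough pool of eligible controls matters so much.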
Both the listing exercise and the survey turned out to be extremely challenging given the local conditions. Our data collection team, led by a group of remarkable field supervisors, conducted the listing exercise in the most remote communities of the island, as well as in larger communities such as São João dos Angolares, a vast and confusing community for those who don’t know their way through its small pathways. By the time the data collection team was administering the survey, the rainy season had started and access to the communities had become more difficult. Once again, the experience of our data collection team, this time of our drivers, was essential to the evaluation’s success. Muddy roads, steep cliffs, fallen trees: they knew what to expect and were prepared for anything.
As if muddy roads and heavy rains weren’t enough, the survey took place during the country’s legislative and municipal election season. This is not exactly the most advisable period to conduct a survey, since election campaigners are known to visit the communities and offer alcohol to their inhabitants. If our data collection team were mistaken for campaigners from a party that wasn’t supported by the community, there was a risk of things getting violent. Aware of these risks, we decided to provide the data collection team with a neutral white uniform, clearly showing our company’s logo and the names of the programmes, PAPAFPA and PAPAC. We also decided, together with the field supervisors, to test out the best time to visit the communities. It turned out that the morning and the early afternoon worked best, as campaign activities usually took place later in the afternoon. On the day before and the day after the election, data collection was suspended as a precautionary measure. At any time, we were ready to suspend the activities if the situation got too heated. Thanks to these measures, however, our data collection team didn’t encounter any election-related difficulties in conducting the survey.
After three and a half months in the field, PlanEval’s data collection team successfully administered the final (approx. 2.5 – 3h long) household questionnaire and provided the IFAD team with the cleaned database. I encourage you to take a look at IFAD’s final evaluation report, which can be accessed through the following link: https://www.ifad.org/en/web/knowledge/publication/asset/41116368.
 Out of a total of 90 potentially eligible communities, 14 were identified as counterfactual communities by more than one cooperative (and were therefore double counted); 14 other communities overlapped with the domain of other cooperatives; 26 communities received support from a cooperative, but at a low intensity; and 36 communities never benefited from either programme (the “pure” communities).
 The common support condition ensures that treatment observations have comparison observations “nearby” in the propensity score distribution (Heckman, LaLonde, and Smith, 1999).
 In recent years, more attention has been given to tourism. The recent discovery of natural resources off its shores has also opened new perspectives for the future.
 The programme helped the cooperatives form agreements with international private sector buyers.
 Unlike cacao, pepper was introduced into the country’s agricultural sector by the programme. The production of organic pepper was almost nonexistent before PAPAFPA.