
Impact Evaluation

Posted by Niall O'Higgins at July 26, 2010

Several contributors have mentioned the importance of impact evaluation, and I think it's worth its own thread.

In principle, impact evaluation is simple - what impact evaluation does is attempt to measure, as the name suggests, the impact of programmes - that is, to compare what happened once the programme was introduced with what would have happened if the programme hadn't been introduced. This is important: if we limit ourselves to judging programmes on their 'success' rates, we don't actually measure what the effect of the programme was - and, more than that, there will be an inherent bias towards selecting participants who are more likely to succeed even in the absence of the programme, rather than those who are most in need of support, or who will benefit most from the programme.

Here the complications begin, however, since of course "what would have happened" is not observable. In practice one needs to simulate what would have happened by using a comparison or control group who could have, but didn't, participate in the programme. There are various aspects to this, but there is also an extensive literature on the subject. Two references from the ILO on impact evaluation of youth programmes: the first is Paul Ryan and Norton Grubb's book, Plain Talking in the Field of Dreams - a very good treatment of the issues in the evaluation of youth programmes. Bits of it are available from:

But I don't know if the full text is freely available.

Another ILO reference is my own book on youth ALMPs, which contains a chapter on evaluation. This has the advantage that it is freely available at:

Also, as I already mentioned, the World Bank had a research project on the evaluation of youth programmes. I repeat the web link here: ,,contentMDK:21454391~isCURL:Y~pagePK:210058~piPK:210062~theSitePK:390615,00.html

(N.B. you need to copy the whole above line into your web browser).

Probably the simplest effective way to undertake impact evaluation is to use an experimental design - that is, at the time the programme is introduced, find a group of people who are eligible to participate and randomly select the actual programme participants from among them. There is a little more to it than that, but the main thing is that it requires forethought. One needs to design the evaluation methodology before the programme is implemented, NOT ex post.
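For illustration, the mechanics of such an experimental design can be sketched in a few lines of Python; the group sizes, employment rates and outcome data below are all invented:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical pool of eligible applicants, identified by id.
eligible = list(range(200))
random.shuffle(eligible)
participants = set(eligible[:100])  # randomly selected onto the programme
controls = set(eligible[100:])      # eligible but not selected

# Hypothetical post-programme outcome: 1 if employed a year later, 0 if not.
# In a real evaluation this would come from a follow-up survey.
outcome = {i: int(random.random() < (0.55 if i in participants else 0.40))
           for i in eligible}

def employment_rate(group):
    return sum(outcome[i] for i in group) / len(group)

# Because assignment was random, the impact estimate is simply the
# difference in mean outcomes between participants and controls.
impact = employment_rate(participants) - employment_rate(controls)
print(f"Estimated impact on employment: {impact:+.1%}")
```

In practice one would also attach a standard error to the difference and check that randomisation was actually respected, but the simplicity of the estimator is the point.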


Another, I think equally - if not more - important issue concerns the question of what the outcome indicators of interest are. Typically, the concern is with the effects of programmes on the chances of finding employment, or sometimes on the wages of participants, but this too requires some thought. Is employment at any cost the aim? Or do we want to promote good quality jobs? We might prefer, for example, wage employment in the formal economy to some form of informal job, and so on.

In any event, your thoughts and contributions are most welcome here too.




Re: Impact Evaluation

Posted by Valentina Barcucci at July 26, 2010

Dear Niall,


Many thanks for raising these interesting points.


Defining outcome indicators is indeed a tricky point. I guess we want them to be as simple as possible and apply to different strategies, for comparability purposes. At the same time, we want them to capture the various aspects of decent work.


I was wondering whether a decent work index would help in impact evaluation. It could take into account - and I am really brainstorming here - the average period treated individuals have spent in employment since they were exposed to the intervention, the average wage, and an indication of the social protection measures available. These indices could be weighted by educational attainment, for instance, as a proxy for disadvantage.
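Purely to make the brainstorm concrete, a sketch of such an index in Python might look as follows; every component, scale, cap and weight here is invented for illustration and would need proper justification:

```python
# Hypothetical composite decent-work index combining time in employment,
# wages and social protection into a single score between 0 and 1.
# All scales and weights below are invented for illustration.

def decent_work_index(months_employed, avg_wage, has_social_protection,
                      months_since_intervention=24, reference_wage=500.0):
    employment_share = months_employed / months_since_intervention
    wage_score = min(avg_wage / reference_wage, 1.0)  # capped at a reference wage
    protection_score = 1.0 if has_social_protection else 0.0
    # Equal weights here; the weighting itself embeds value judgements
    # that would have to be made explicit and defended.
    return (employment_share + wage_score + protection_score) / 3

# 18 of 24 months employed, average wage 350 against a 500 reference,
# with social protection: index of about 0.82.
print(round(decent_work_index(18, 350.0, True), 2))
```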


Another tricky issue possibly is how to strike a balance between feasibility and accuracy. As you point out in your book, evaluation by gross outcomes (without a control group) is often the only one carried out, due to lack of data and resources. When a control group is there, sometimes its size is very limited, or the selection of the sample introduces significant bias in the analysis.


These methodologies can be very misleading. However, while we want our impact evaluation to be convincing, we also need to deal with reality. How can we design a flexible and easily replicable methodology? Would it be possible to identify the minimum requirements for an acceptable counterfactual, in terms of selection and size? In sum, is there a way to ‘impact evaluation process engineering’?


OK, that’s enough questions I guess!






Re: Impact Evaluation

Posted by Kee Beom Kim at July 27, 2010

Dear Niall,

I would agree that experimental techniques, including randomized trials, can be an excellent means of evaluating programmes. I am not aware of such randomized trials in the Asia region, but in Colombia, the government's "Jovenes en Accion" programme randomly offered vocational training for 6 months (3 months in the classroom and 3 months on the job) to disadvantaged, unemployed youth. A study titled "Training disadvantaged youth in Latin America: Evidence from a Randomized Trial" by O. Attanasio et al. finds that (copied from the abstract): "We find that the program raises earnings and employment for both men and women, with larger effects on women. Women offered training earn about 18% more than those not offered training, while men offered training earn about 8% more than men not offered training. Much of the earnings increases for both men and women are related to increased employment in formal sector jobs following training. The benefits of training are greater when individuals spend more time doing on-the-job training, while hours of training in the classroom have no impact on the returns to training. Cost-benefit analysis of these results suggests that the program generates a large net gain, especially for women."

The study is available at:


Re: Impact Evaluation

Posted by Niall O'Higgins at July 28, 2010

Dear Valentina and Kee,

I am not a whole-hearted fan of experimental evaluation, although it has risen in my estimation in recent years (certainly since I wrote the book), but it certainly has some major advantages. If I can go back to Valentina's comments: another way of stating the two issues is that with impact evaluation one is trying to:

a) accurately measure the effects of the programme on some indicator(s) - such as the chances of finding employment; and,
b) the indicator(s) need(s) to reflect something meaningful and (presumably) the programme's goals.

The experimental approach to impact evaluation has the major advantage that, given some basic conditions, the measurement of impact is very simple. However, application of the methodology does require some thought and expense ex ante. The key thing is that one finds a group of people who can and wish to participate in a programme and then randomly selects about half of them to do so. The other half then become the control group. One can then compare the post-programme performance of participants on the chosen indicators (e.g. the probability of being in employment).
The real difficulty is ensuring that the selection of participants really is random - there are a number of reasons why this may not be the case.
A second difficulty arises in that one would wish to ensure that the existence of the programme itself does not affect the experiences of the control group. This second issue arises with any type of evaluation, experimental or not; however, there is the danger that this aspect gets forgotten, particularly with experimental evaluations. One approach used in, for example, the evaluation of Conditional Cash Transfer programmes in Latin America, is to apply the programme in some (randomly chosen) geographical areas (e.g. towns and/or villages) and compare the experiences of CCT recipients with the experiences of similar families in other, similar towns/villages where the CCTs are not distributed. One can then evaluate the impact of the programme in terms of mean school attendance (CCTs are principally a means of subsidising school attendance) or whatever the specific aim of the programme is.
The main point is that experimental evaluation requires serious forethought and monitoring at the participant selection stage.

In any event, Valentina, you were mainly concerned with the second issue. My view on the type of outcome indicator to use would be: use several, rather than a composite index of, say, decent work. One loses some simplicity, but in my view one gains significantly in terms of meaningfulness and also information. One example could be a training programme intended to increase the skills, and so the (post-programme) employment chances, of participants. One could define some sort of composite variable representing the quality of jobs (in whatever sense, including the various aspects that you mention), and measure the impact of the programme on this index. Let's say, to keep things simple, that the possible outcomes are a good job, a bad job or no job at all. One could attribute values to these outcomes - say 1, 1/2 and 0 (and there is the first problem right there - one has already introduced external values into the evaluation: we are saying that a bad job is worth half what a good job is worth). Much better to measure the two outcomes separately: the impact on the chances of getting any job, and the impact on the chances of getting a good job. For example, the programme raised the probability of finding some sort of job by 20%, but had no effect on the chances of finding a good job. This is more meaningful and provides more information than the statement that the programme raised the index by 10%. It could also be used to pose questions about why the programme didn't increase the chances of getting a good job.
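A toy calculation (all probabilities invented) makes the point; it uses the 1, 1/2, 0 values from the example above:

```python
# Invented outcome distributions for a control and a treated group.
# Possible outcomes: good job, bad job, no job.
control = {"good": 0.10, "bad": 0.30, "none": 0.60}
treated = {"good": 0.10, "bad": 0.50, "none": 0.40}  # programme adds only bad jobs

def any_job(p):
    return p["good"] + p["bad"]

def composite(p):
    # Attribute 1 to a good job, 1/2 to a bad job and 0 to no job.
    return 1.0 * p["good"] + 0.5 * p["bad"]

# Separate indicators show exactly what changed...
print(f"impact on any job:  {any_job(treated) - any_job(control):+.2f}")      # +0.20
print(f"impact on good job: {treated['good'] - control['good']:+.2f}")        # +0.00
# ...while the composite hides it behind an embedded value judgement.
print(f"impact on index:    {composite(treated) - composite(control):+.2f}")  # +0.10
```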

The important point you raise here is that the outcome indicator(s) need to be carefully thought through. Once one has done this, one doesn't actually need to put them together. It is important, for example, to consider job quality as well, and not just jobs per se, and so on. That, I think, is the key issue.

Anyway, that's my take on your contribution - thanks a lot, it was very stimulating - and thanks for the additional information, Kee.





Re: Impact Evaluation

Posted by gbetcherman at July 29, 2010

Hi -- I am late joining this conversation but I have found it interesting. Based on my own work on evaluation studies and also on interacting with policy makers and people implementing the programs, I think there are two big issues here: first, getting those responsible to appreciate the value of the information generated by evaluation studies and, second, dealing with the study design issues. Although the latter issue is the one that gets most of the attention (including in this forum), it would be good to talk a bit about the former one, because it is still far too common for those making decisions about ALMPs not to appreciate the importance of evaluation in the first place. In the World Bank inventory of youth program evaluations that Niall mentioned in an earlier post, it was evident that a lack of evaluation information is a big problem. Indeed, in Asia, this is worse than in many other regions. Not surprisingly, policy-makers and program managers can feel threatened by evaluations if they see them as a "test" of their intervention and of the resources they have received to implement it. This is compounded by a fairly general lack of understanding about evaluation including, for example, why gross outcome indicators are not a good measure of program effectiveness and may in fact be a misleading one. So it is important to emphasize information and education, and to frame evaluations as a learning device, not an instrument of accountability. I have found that everyone becomes more responsive when they begin to see that good evaluations can help them understand how to improve the effectiveness of their program.


Two other points on this come to mind. First, there are the related issues of financing and timing. Especially in developing countries, of course, funding for ALMPs is generally very tight. It is a tough sell to get a cash-strapped minister to agree to spend a fair chunk of money on evaluation (all the more so given the problems above). I remember once encouraging a Deputy Minister of Labour (a well-informed and serious official) in a large middle-income country to do a serious evaluation of their programs. To do this well, either with an experimental or quasi-experimental design, was going to cost at least $500,000. Since the total budget for ALMPs under his control was maybe $10-15 million, he just could not see allocating that amount of money to analyze the impact of his programs. Maybe shortsighted, you could say: 5% on understanding what works does not seem like a lot. But when he was being judged by his bosses on how many unemployed workers were going through his programs, he could easily calculate the additional numbers that could be enrolled with those $500,000. The other thing is that a good impact evaluation needs a few years before it generates reliable information on what a program actually does for participants. That is well beyond the time horizon of many.


I know this all may seem like a tangent, but these "real-world" constraints have started to make me wonder whether "evaluation lite" approaches can be used as a complement to full-scale evaluations and as an instrument to get decision-makers and programmers to start focusing more on how evidence can be used as an instrument of learning. By "evaluation lite", I mean using evaluation techniques that are something of a compromise between gross-outcome monitoring and rigorous (but expensive and multi-year) evaluations. The monitoring and impact evaluation instruments obviously have their place in performance management systems -- I am not arguing otherwise. But intermediate instruments (what I call evaluation lite) may also have a role. These could rely more heavily on existing administrative and survey databases and look backwards (rather than waiting for enough time to pass to assess impacts). Obviously this approach needs a lot of thinking through to understand what can and cannot be learned. But it would help to upgrade the information base that most decision-makers in developing countries are now using and would hopefully pave the way for increasing their appetite for rigorous impact evaluations. The important thing is to sell the notion of evidence-based decision-making!


The last point I want to make is closer to the discussion by others in this forum and that has to do with the content of evaluations and various methodology questions, such as the choice of indicator discussed above. My own view is that the choice of indicator needs to be driven by the objective of the program. Anything else is not fair. (That does not mean that an evaluator could not conclude that the program's objectives may not be the best and to suggest alternatives.) However, for most programs, I think that the standard outcomes -- some measure of employment and some measure of earnings -- are reasonably appropriate. After all, in most cases, the primary objectives of the intervention are to help participants get jobs and to increase their earnings. Moreover, earnings are usually not a bad proxy for job quality, if you are interested in that as well. But the other thing about content is to design an evaluation that will not only inform on cost effectiveness (let's not forget the cost side in all of this) but will also offer some promise of informing on program design and implementation and features that can be adjusted by decision-makers. Too often, evaluators emphasize the bottom line (does participation lead to better outcomes) without rigorously testing what it is about the program that determines outcomes. In other words, the program itself is often left as a "black box". Looking inside the black box is necessary to really provide information that will have practical use. This can only be done if the evaluation team works closely with the program team in the earliest stages of the evaluation design.

Gordon Betcherman

Re: Impact Evaluation

Posted by Niall O'Higgins at July 29, 2010

Thanks Gordon, several important points there I think.

On what you call evaluation lite approaches, I think you are right in that much data is collected - for example Labour Force Survey data - which can be used to identify past participation in programmes and so to construct some sort of quasi-experimental evaluation. All the more so with panel data, which is now being collected (or constructed from existing rotating LFS samples). Certainly this is a fruitful road to explore, and thanks for bringing it up. I myself recently used administrative programme (and registered unemployment) data in evaluating a programme for the long-term unemployed in Europe; although it did require some rather heroic assumptions, it was able to suggest which forms of the programme worked better and gave some indication as to why. It was also able to suggest under what assumptions the programme was cost-effective.

I might add that, even without a control group, one can gain information on which forms (and/or elements) of a programme work better than others with only information about programme participants (assuming, of course, there is some variation in the form or elements that they receive/undergo as part of their participation). And I agree that far too little emphasis is placed on understanding why some programmes or programme elements work better than others.

The danger is, however, that these 'lite' forms of evaluation are seen as an adequate substitute for - rather than a complement to - properly designed impact evaluations. In the example I was involved in above, the policy makers to whom the evaluation was addressed were essentially concerned with whether the programme was cost-effective in their terms - i.e. whether the tax revenue generated was (or rather would be) greater than the cost of the programme. The results were framed in terms of the assumptions and so on, but the bottom line that was being sought was a yes/no answer. If the lite approach is adopted, this is not really appropriate (at least not without a series of qualifications). With experimental impact evaluation in particular, one can get closer to a fairly precise (and relatively unqualified) answer to simple questions such as: did the programme increase a) the chances of finding a job, and b) the post-programme wages of participants - and by how much. From there one can arrive at cost-effectiveness.

I guess somehow there is a balance to be found.  



Re: Impact Evaluation

Posted by Paul Ryan at July 30, 2010

I’m prompted to contribute to the discussion by some comments made by Gordon and Niall. I’m joint author with Norton Grubb of the ‘Plain Talk …’ volume on evaluation to which Niall referred.


I’d like to back Gordon’s advocacy of ‘evaluation lite’. Academically oriented evaluators tend to make the ‘best’ the enemy of the ‘good’, which actually encourages the dominance of the ‘poor’. What I mean is that the ‘best’ evaluation methods, including social experiments and sophisticated econometric techniques, are complex and expensive, and therefore often not feasible, in low-income countries in particular. But ‘no evaluation’ or ‘gross outcomes’ evaluation are at best uninformative, and potentially misleading. Intermediate, ‘good’ options are available instead. They include simple ‘difference in difference’ methods: i.e., comparisons of the change in outcomes for participants, from before participation to after participation, to the change in outcomes over the same period for a reasonably similar group of non-participants – a comparison that potentially avoids the weakness of simple ‘before and after’ comparisons. (Data on the change in outcomes for non-participants have to be available from existing surveys of households or young people.)
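As a minimal illustration of the difference-in-difference idea (all employment rates below are invented):

```python
# Employment rates before and after the programme, for participants and for
# a reasonably similar comparison group drawn from an existing survey.
participants = {"before": 0.30, "after": 0.55}
comparison   = {"before": 0.32, "after": 0.42}

# A naive before/after comparison attributes the whole change to the programme.
before_after = participants["after"] - participants["before"]        # +0.25

# Difference-in-differences nets out the change the comparison group saw over
# the same period (e.g. a general labour-market recovery).
did = before_after - (comparison["after"] - comparison["before"])    # +0.15

print(f"before/after: {before_after:+.2f}  diff-in-diff: {did:+.2f}")
```

The method stands or falls on the comparison group experiencing the same background trend as participants would have - which is exactly where the 'reasonably similar' qualification bites.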


Caceres Cruz used such an approach to evaluate Chile Joven, a labour market training programme in Chile in the 1990s. That evaluation, which we featured in ‘Plain Talk …’, is open to various objections, but to my mind its findings go a long way toward establishing the merits of the programme in question.


To re-use a metaphor, the goal of reaching ‘Rome’ (perfect evaluation) remains, but it is better to get most of the way there in a Rentawreck car (I used to own a Citroen Deux Chevaux) than to abandon the journey for lack of an Aston Martin.


My second point is that evaluation methods that rely on individual micro data are potentially distorted by their inability to control for displacement: i.e., the displacement of non-participants by participants in employment after the programme. This is of particular concern for work-based training, which tends to succeed by creating links between participants and the employers who sponsor their training, to the detriment of any non-participants who would have been hired had the programme not existed.


Finally, I’d like to advocate institutional development in preference to labour market programmes – particularly when it comes to long-term social and economic development rather than short-term crisis management. The European countries that have fostered apprenticeship training as part of their vocational education systems have not had to use labour market programmes as extensively as have their counterparts that lack, or have neglected, apprenticeship training. It is true that the institutional networks (e.g., employer co-operation, social partnership) on which successful apprenticeship systems rest are often missing and hard to develop, as Jooyeon Jeong showed for South Korea’s attempt to import a Germanic work-based training system in the 1990s. But the successes achieved by France and Ireland in recent years in expanding and improving their moribund apprenticeship systems suggest scope for innovation even in less institutionally favourable environments.


Paul Ryan
