Manual classification of a sample of charities

One of the first tasks the research team undertook was to manually classify a sample of charities, assigning each organisation a single ICNP/TSO category and as many ‘tags’ as were applicable.

The classification exercise took place over several weeks, with each researcher working through a randomly allocated batch of organisations. The researchers met frequently to compare notes, discuss any tricky or ambiguous cases, and agree on shared rules for dealing with them.

Inter-coder reliability

The UK-CAT was developed in parallel with the creation of the manually classified dataset, so the list of tags was still changing while the coding was under way. Although the most significant changes occurred early on, this made it difficult to apply a stringent test of inter-coder reliability to the tags. Instead, weekly meetings between the coders were used to clarify ambiguous cases and further refine the system. In addition, the main function of applying the tags to the manually classified dataset was to develop the UK-CAT list itself, as well as to help generate potential keywords during the keyword classification stage.

When applying the ICNP/TSO, we began with the coding scheme already formed and applied only one category per charity. This made it easier to test inter-coder reliability, which we did halfway through the creation of the manually classified dataset.

For this purpose, the three coders each coded the same 50 charities using the ICNP/TSO categories, based on each charity's name, activities and objects. All three coders chose the same category in 60 per cent (30) of cases. Two out of three coders matched exactly in 28 per cent (14) of cases, and all three chose different categories in 12 per cent (6) of cases.
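The breakdown above can be reproduced with a short tally. The following is a minimal sketch rather than the code the team used: the function names `agreement_breakdown` and `as_percentages`, and the category labels in `example`, are hypothetical and chosen only for illustration. It assumes each charity is represented by a tuple holding the three coders' chosen categories.

```python
from collections import Counter


def agreement_breakdown(codings):
    """Count charities where all three, two, or no coders agreed.

    `codings` is a list of 3-tuples, one per charity, holding the
    category each of the three coders assigned (hypothetical layout).
    """
    counts = {"all_three": 0, "two_of_three": 0, "none": 0}
    for labels in codings:
        if len(labels) != 3:
            raise ValueError("expected exactly three codes per charity")
        # Size of the largest group of coders who chose the same category.
        largest_group = Counter(labels).most_common(1)[0][1]
        if largest_group == 3:
            counts["all_three"] += 1
        elif largest_group == 2:
            counts["two_of_three"] += 1
        else:
            counts["none"] += 1
    return counts


def as_percentages(counts):
    """Convert a dict of counts into percentages of the total."""
    total = sum(counts.values())
    return {key: 100 * value / total for key, value in counts.items()}


# Illustrative labels only; these are not taken from the real dataset.
example = [
    ("Health", "Health", "Health"),
    ("Education", "Education", "Social services"),
    ("Culture", "Religion", "Environment"),
    ("Health", "Health", "Health"),
]
breakdown = agreement_breakdown(example)

# The counts reported in the text (30, 14 and 6 out of 50 charities).
reported = as_percentages({"all_three": 30, "two_of_three": 14, "none": 6})
```

Applied to the reported counts, `as_percentages` gives 60, 28 and 12 per cent. A raw agreement tally like this does not correct for agreement expected by chance; a chance-corrected statistic would need the per-charity codes, which are not reproduced here.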

In a few cases, the disagreements were simple mistakes. Much more commonly, though, a charity could reasonably be assigned to more than one ICNP/TSO category, especially where no obviously applicable category exists in the scheme. Partly, this reflects some of the concerns with the ICNP/TSO categories that were raised in the introduction and that motivated the development of the UK-CAT. On the other hand, classification is inherently subjective, so it would be misleading to suggest that in each case there is necessarily only one ‘correct’ answer.

Regardless, the exercise suggests that the training set should be treated with a degree of caution, a point discussed further in the machine learning section later on.