Having spent a fair bit of time over the past few weeks checking out other institutions’ IRs, one thing is clear: very few are using controlled subject vocabularies, except in the most rudimentary way. Most of the Australian IRs are using the Research Fields, Courses and Disciplines (RFCD) codes from the Australian Standard Research Classification. This is logical, since research activity must be reported by these codes under the Higher Education Research Data Collection (HERDC) scheme, and laudable, since the codes provide a common search point across Australian IRs.

Most IRs are also using user-suggested keywords, sometimes (but not often) supplemented by metadata specialists. Very few are using formal controlled vocabularies apart from RFCD. This is understandable, as implementing controlled vocabularies in IRs can be quite a complex undertaking.

To enumerate just a few of the problems:

  • Which controlled vocabulary to use? Different disciplines may have different preferences.
  • Not all repository software supports built-in controlled vocabularies; where it does not, how to ensure that users apply the recommended vocabularies?
  • If multiple vocabularies are supported by the IR (Fez comes to mind here), how to manage them and their different user groups without overly complicating repository administration?
  • How to ensure that users select appropriate terms? One user’s “car” is another user’s “1925 Ford Model T Tudor sedan”.
  • If metadata specialists are to vet user-submitted terms, how to resource such a labour-intensive task, especially for potentially high-volume submissions?
  • How to balance the need to make controlled vocabularies compulsory against the frustration users feel when they encounter required fields? <link to follow>

So the challenge for repositories is to determine not just whether fully-fledged controlled subject vocabularies are worth using in (and building into) their IRs, but, if so, which ones to use and how best to implement them with limited resources, without alienating users or compromising usability.