Speech Recognition and Closed Caption Searches for Internet Video Archives

Technology has enabled digitization of both new videos and those that predate the Internet for online streaming.  With more videos becoming available on the Internet, websites are developing speech recognition technology that may eventually lead to closed caption searches, a useful tool for canvassing video content.

Google Audio Indexing for YouTube Closed Captioning

The BBC reported that Google has expanded its closed captioning technology, Google Audio Indexing (Gaudi), to all of the English language content on its subsidiary website YouTube.  Subtitles can be embedded in videos during the uploading process, and anyone can request the addition of captions for videos with clearly spoken audio tracks.  Only English language content is supported for now, but the written captions can be translated to other languages.  Intended to make YouTube accessible for the hard of hearing, this technology also has great potential for increased video searchability.


The movie and television show streaming website Hulu is expanding its search features by including a caption search for its closed captioned programs.  Most of its news programming and talk shows do not yet have closed captioning, but Hulu may expand the feature to include all of its content.  Once this is in place, Hulu would be a useful resource for researching public figures, since it would make their television appearances easier to find.

Since the feature is still in beta testing, searches can only be performed for one show at a time.  The search works as follows:

  • After selecting a show or program with closed captioning, you are given the option to search under a "Captions" tab.
  • The search results show which episodes contain the search word or phrase, with links directly to the matching segments.
  • Results are also presented in a popularity graph that shows which parts of the episode have been viewed the most. Points in the episode where the search term occurs are highlighted in the same graph for comparison.
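
The general idea behind a caption search like the one described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hulu's actual implementation: the Cue record, the search_captions function, and the sample caption data are all hypothetical, and captions are assumed to be available as timed text lines per episode.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One timed caption line (hypothetical structure, for illustration only)."""
    episode: str
    start_seconds: float
    text: str


def search_captions(cues, phrase):
    """Map each episode to the start times of cues whose text contains the phrase.

    Matching is a simple case-insensitive substring test.
    """
    needle = phrase.casefold()
    hits = {}
    for cue in cues:
        if needle in cue.text.casefold():
            hits.setdefault(cue.episode, []).append(cue.start_seconds)
    return hits


# Made-up sample captions for a single show.
cues = [
    Cue("Episode 1", 12.0, "Welcome back to the show."),
    Cue("Episode 1", 95.5, "The senator discussed the budget."),
    Cue("Episode 2", 40.0, "Budget talks resumed today."),
]

results = search_captions(cues, "budget")
for episode, times in results.items():
    print(episode, times)
```

Each returned start time could serve as a direct link into the matching segment, and the same timestamps could be overlaid on a per-episode popularity graph.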