Google has released the public beta of Video Search. According to the help documents, the indexing is based on the closed captioning text that is included with TV programs as they air, so the index may contain errors wherever the CC text is incorrect. As of now, you can only search and view thumbnails for a limited number of programs on a select list of channels. In the future, it is possible that the videos will be playable from the search results. Let's hope.
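For the curious, here is a minimal sketch of how caption-based indexing might work. It is my own illustration and not Google's implementation: the sample caption data and the `tokenize`, `build_index`, and `search` functions are all hypothetical. It builds a simple inverted index that maps each captioned word to the program and timestamp where it was spoken.

```python
import re
from collections import defaultdict

# Hypothetical caption data: (program, seconds into the broadcast, caption text).
CAPTIONS = [
    ("Evening News", 12, "Google has released a video search beta"),
    ("Evening News", 47, "The index is built from closed captions"),
    ("Tech Weekly", 5, "Podcasts and video blogs lack captions"),
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(captions):
    """Map each word to the set of (program, timestamp) pairs where it was captioned."""
    index = defaultdict(set)
    for program, seconds, text in captions:
        for word in tokenize(text):
            index[word].add((program, seconds))
    return index

def search(index, query):
    """Return the sorted caption positions that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

index = build_index(CAPTIONS)
print(search(index, "captions"))
```

Note that a word misspelled in the captions goes into the index exactly as typed, which is why errors in the CC text would carry straight through to the search results.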
This is a pretty great concept. I wonder if businesses will be able to purchase the Google Appliance with a video module. That would surely be a boon to visual resources libraries and archivists. As with the existing Google Appliance, however, much of the metadata and index is not open to the purchaser, so it is unknown what you could do with categorization and other metadata operations.
The fact that Google is using the CC text makes video searching seem easy for broadcast TV, but more difficult for non-captioned media, including MP3 audio (MP3 ID3 tags include a lyrics field, but that isn't utilized in iTunes or on iPods). I wonder if there is a way to make other non-text media indexable besides creating captions. Specifically, I'm thinking about voice recognition and automated captioning. Presently, I'm aware that you can create your own captions by submitting your media to a closed captioning service like AcidDVD, who will do the work for a fee. You can also use software, like CC Caption/Mac Caption, to encode the captions yourself. What I want is software that does a decent initial captioning pass using voice recognition and then lets you edit the captions in a video editing tool similar to iMovie.
What I'm also wondering about is what the Internet Archive people have on their roadmap for future services. They now let you publish Creative Commons licensed video and audio to the Archive. If they eventually get access to a program that can automatically generate usable text captions for things such as video blogs and podcasts, those captions could be used by Google Video Search as well. With the burgeoning grassroots media publishing of videoblogs and podcasts, this is what we really need.
I'm sure the indexing of audio would throw a wrench into the progress made recently with the ASCAP licensing rules, but this is the way of the future. Making all recorded media findable is probably scary to some, but on the whole it is probably a very desirable view of the search future.
Comments
02/22/05 @ 11:37
I remember when our group was looking at Virage they were already doing closed caption/ppt indexing along with the video/sound as part of their product lines. This was about 4 years ago and it was quite exciting and very useful in the education arena. The content producer had the option to have full text searching and/or metadata searching.
I think it is great to make this available on Google, but what I think is more important is that most of it was considered "deep web" content hidden behind television programs' websites.
02/22/05 @ 14:14
Thanks for the info about Virage, ML. Now I'm looking for the real innovation, which will be some sort of voice-recognition-to-text that will make this media indexable without CC transcripts.
03/03/05 @ 15:02
Blinkx should be of interest, at least to see that they are doing voice recognition for video (mostly TV focused now).
Hope this helps.
Matt
03/03/05 @ 18:04
I am aware of blinkx. I didn't know they were doing anything with voice recognition, though. Will have to check it out to see if they're using transcripts, closed captions or actual software voice recognition. If it's the latter, that would be cool, and I wonder if they would index video and audio in the Internet Archive. Then if everyone's podcasts were eventually archived there, that would be awesome.