Dear Avid: Computers are a Thing and you should use Them
Dear Avid,
Let me tell you about computers. Computers are amazing. They can do lots of things that humans can do. They excel at boring and repetitive, data-driven tasks. I'd like you to realize this, because your program, Avid, runs on a computer, and should probably take advantage of that fact one of these days.
Herein I'm going to lay out some things that computers can do, some things that Avid already does, and some things that people pay thousands of dollars for people to do which Avid could do because: Computers. This will get into some very "big" ideas for the future of software, hardware, and video production and post-production. There will be no tl;dr.
--
Computers can do voice recognition. I can use Dragon speech to text to transcribe my footage, but Avid could just do this for me.
Computers can do facial recognition. The camera on my phone knows where faces are and Facebook can figure out who that face belongs to. Avid could do this, too, and automatically log and categorize footage based on who appears in it. Combined with voice recognition, Avid could generate transcripts by character and automatically generate script sync modules for every clip, ever.
Computers are really good at comparing two values. Say, for example, the audio data in one file versus the audio data in another. PluralEyes already does this, but Avid could do it, too, and automatically sync all of my audio files. Using facial recognition and voice recognition and scripts, it could compare the audio data to existing scripts and determine if a new file was actually from a different day, or could find the same audio on a cleaner mic, or could do lip-reading analysis to match an audio-only file to a video-only file.
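To make the "comparing two values" point concrete, here is a minimal sketch of syncing by audio comparison: slide one clip against the other and keep the offset where the samples line up best (a brute-force cross-correlation). This is not how PluralEyes or Avid actually work under the hood; the function name `best_offset`, the `max_shift` parameter, and the toy sample lists are all made up for illustration, and real audio would want a much faster FFT-based approach.

```python
def best_offset(reference, other, max_shift):
    """Return the shift (in samples) that best aligns `other` inside `reference`.

    A positive result means `other` starts that many samples after `reference`.
    Brute force: try every shift in [-max_shift, max_shift] and keep the one
    with the highest sum of sample-by-sample products.
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, sample in enumerate(other):
            j = i + shift
            if 0 <= j < len(reference):
                score += sample * reference[j]
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Toy data (hypothetical): the camera's scratch audio contains the same
# burst of sound that the separate recorder captured.
camera_audio = [0, 0, 0, 1, -2, 3, -1, 0, 0, 0]
recorder_audio = [1, -2, 3, -1]
print(best_offset(camera_audio, recorder_audio, max_shift=6))  # prints 3
```

The returned offset is what a sync tool would use to slip the recorder's file into place against the camera clip.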
Computers are really good at boring and repetitive tasks. For example, the way to multigroup some footage is to put all footage from your shoot into a sequence with each camera on a different video track (and corresponding audio on the respective audio tracks), sync it, and add edits across all tracks. The easy (but not as good) way is to do this at all head frames. A computer could almost instantly do this, without error. The more advanced way is to add edits to head frames only when the previous clip-end-frame was a tail frame. A computer could also do this with near instantaneous speed, and again, without error. Then, to finish grouping, you have to subclip every possible subclip in the timeline once the add edits are in, and assign the timeline TC to each subclip's AuxTC. Again, a computer could do this step, flawlessly, and with great speed.
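The add-edit rules above can be sketched in a few lines. This is one reading of the "advanced" rule (walk every clip boundary in timeline order and cut at a head frame only if the boundary just before it was a tail), not a description of anything Avid ships. The `Clip` type, the `add_edit_frames` function, and the frame numbers are hypothetical, and the sketch treats the very start of the timeline as if it followed a tail.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    track: str   # camera / video track label
    start: int   # timeline frame of the clip's head
    end: int     # timeline frame just past the clip's tail


def add_edit_frames(clips, only_after_tail=False):
    """Return the sorted timeline frames where add edits would go.

    only_after_tail=False: the easy way, an add edit at every head frame.
    only_after_tail=True:  the advanced way, an add edit at a head frame
                           only when the previous boundary was a tail.
    """
    events = []
    for clip in clips:
        events.append((clip.end, 0, "tail"))    # 0 sorts tails before heads
        events.append((clip.start, 1, "head"))  # when they share a frame
    events.sort()

    edits = []
    previous = "tail"  # assumption: nothing is rolling before the first clip
    for frame, _, kind in events:
        if kind == "head" and (not only_after_tail or previous == "tail"):
            if not edits or edits[-1] != frame:  # skip duplicate frames
                edits.append(frame)
        previous = kind
    return edits


shoot = [
    Clip("A", 0, 100),
    Clip("B", 50, 150),
    Clip("A", 100, 200),  # camera A buttons off and straight back on
    Clip("C", 300, 400),
]
print(add_edit_frames(shoot))                        # [0, 50, 100, 300]
print(add_edit_frames(shoot, only_after_tail=True))  # [0, 100, 300]
```

Subclipping between those frames and stamping each subclip's AuxTC with the timeline TC would be the next loop over the same list.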
What I'm trying to say is, because of computers, Avid could sync all of your footage, generate a transcript of it, catalogue who is in each shot, find the cleanest audio for each character, find which camera angles show each character's faces, and could do this all with existing technology, automatically. Then it could combine all of this data and footage into multigroups, all on its own. What I'm saying is that 90% of the thing we call "Assistant Editing" could all be done by a computer. More quickly, more effectively, and more cheaply.
But we don't need to stop there. Computers can do all of these things so well that we could go even farther. If a computer can identify footage, place it in a timeline, sync it, log it, and group it all on its own, why not just have computers generate the timelines and multigroups dynamically and on the fly? As in: Let's eliminate the concept of timelines and groups and clips and all that. Let's have a Stream. Footage comes into the system, Avid scans it over for video and audio data, and then places it into the stream, at the right point in time relative to all other footage. This stream is instantly updated to all users, and can be viewed in Timeline form, or Group form, or some kind of other wicked awesome form that I can't even imagine because the Stream doesn't exist yet. A dynamic footage stream would eliminate the need to fix groups because they were missing a clip. It would eliminate the need for over-cutting edited sequences with that fixed group. It would mean that editors could start working the moment that the first tape was ingested with no fear of the footage "not being ready."
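As a very rough sketch of the Stream idea: keep every clip in one list ordered by its place in time, insert new footage wherever it belongs as it arrives, and build "views" by querying that list. Everything here is hypothetical (the `StreamClip` and `Stream` names, the frame-based timing, the clip names); it only shows that late-arriving footage can drop into place without rebuilding anything.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class StreamClip:
    start_frame: int                  # position in shoot time
    end_frame: int                    # exclusive
    camera: str
    name: str = field(compare=False)  # not used for ordering


class Stream:
    def __init__(self):
        self.clips = []  # always kept sorted by start_frame

    def ingest(self, clip):
        """Place a clip at the right point relative to all other footage."""
        bisect.insort(self.clips, clip)

    def cameras_at(self, frame):
        """A tiny 'Group view': which cameras are rolling at this frame?"""
        return [c.camera for c in self.clips
                if c.start_frame <= frame < c.end_frame]


stream = Stream()
stream.ingest(StreamClip(200, 300, "B", "B002"))
stream.ingest(StreamClip(0, 250, "A", "A001"))
stream.ingest(StreamClip(100, 220, "C", "C001"))  # arrives late, still lands in order
print([c.name for c in stream.clips])  # ['A001', 'C001', 'B002']
print(stream.cameras_at(210))          # ['A', 'C', 'B']
```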
That's all stuff that computers can do right now, and that Avid could do if they got out of the engineer mindset and hired some user experience coders, preferably ones who excel at efficiency and understand production timetables and budgets.
--
But we can go even farther if we consider that Avid could influence hardware manufacturers. Cameras record metadata to every clip that includes TC, camera ID, etc. They could also record positional data (get on it, Sony) and Avid could dynamically generate a 3D map of the shooting location and where all cameras in your Stream are located at any given moment, what direction they're facing, field of view, etc. Combined with facial recognition, we could generate a map of where the characters are in 3D. This would allow for a 3D view to be used as camera selection so you can see instantly which characters are seen at what times and in what angles, and the editor could pick with a single click which camera to cut from. Instead of viewing multi-cam, they could view the Stream in Map mode.
But what about broll? It's often shot out of chronological order so placing it in the Stream isn't helpful. Well, Avid can flag shots that don't register any of the characters' faces. Someone can curate the shots as broll, or the camera op can enter broll as metadata when recording, which Avid could use. Then Avid could use the same software used for facial recognition to look at objects, and combine it with location and position data to figure out if a shot is broll from a certain location or of a certain object. That can be superimposed over the Stream and Map (or put into its own Stream) so that editors can see - in addition to their primary footage - all of the broll shot at and around that location regardless of the time of day that it was shot.
In fact, location data would allow editors to see all footage from a location, regardless of its place in the Stream, so cheats and substitute shots could be quickly and easily retrieved. I'm scrolling through the Stream, but at a fixed location so I'm only seeing footage from that location rather than from the entire shoot.
Of course, positional data relies on camera hardware changes. That's not Avid's department. But it's a thing that Avid should be actively encouraging in camera manufacturers.
--
Let's get back to the real, here and now, things that Avid could be doing. Getting away from the lofty goals of a seamless and instantaneous Stream and 3D location-based footage browsing, let's talk tiny:
Did you know that your bins already have unique identifiers which are automatically generated by the computer whenever they are created? That's how the bin knows if a duplicate copy of the same bin is being opened. Did you know that if you make four bins named "Act 2" that the attic won't save all four of them because it references the name of the bin rather than the unique identifier? Yeah.
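A toy illustration of why the key matters (this is not Avid's attic code, and the `Bin` class and both backup functions are invented for the example): back up four bins by name and three of them vanish; back them up by unique identifier and all four survive.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Bin:
    name: str
    contents: str
    # Stand-in for the identifier generated when a bin is created.
    bin_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def backup_by_name(bins):
    """Keyed on the bin name: same-named bins overwrite each other."""
    return {b.name: b.contents for b in bins}


def backup_by_id(bins):
    """Keyed on the unique identifier: every bin keeps its own slot."""
    return {b.bin_id: (b.name, b.contents) for b in bins}


bins = [Bin("Act 2", f"cut for episode {n}") for n in range(1, 5)]
print(len(backup_by_name(bins)))  # 1
print(len(backup_by_id(bins)))    # 4
```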
There's also a big problem with end-users. We're not smart. We name bins "Act 2" when we damn well know we're doing a 26-episode season and every episode has an Act 2. Avid could enforce some bin data in the project window such as the creation Date, Episode number, Show ID, and so forth. In fact, the Lead AE could set Avid to "name all bins with" and then generate a bin name template. On my shows, we do <date> <show ID> <episode number> as a prefix to all bins. This could be automatic.
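A bin name template like that is a few lines of code. In this sketch the `bin_name` function, the `YYYYMMDD` date format, the zero-padded episode number, and the show ID "XYZ" are all assumptions made for illustration; the only part taken from the post is the prefix order of date, show ID, episode number.

```python
from datetime import date

# Hypothetical project-wide template a Lead AE might set once.
DEFAULT_TEMPLATE = "{date} {show_id} {episode:02d} {name}"


def bin_name(name, show_id, episode, created=None, template=DEFAULT_TEMPLATE):
    """Build a bin name by prefixing the project metadata automatically."""
    created = created or date.today()
    return template.format(
        date=created.strftime("%Y%m%d"),
        show_id=show_id,
        episode=episode,
        name=name,
    )


print(bin_name("Act 2", "XYZ", 7, created=date(2014, 3, 5)))
# 20140305 XYZ 07 Act 2
```

With the prefix generated from project data, two editors typing "Act 2" in different episodes no longer end up with identically named bins.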
Inside bins, sequences are dumb. They could be made smarter, using similar naming conventions, or by adding columns for things like "last touched by". Modified date and time shouldn't change unless you actually change an edit. Moving your mark in point is not a modification of a cut, Avid. It's just not.
Also, why doesn't Avid generate user logs across ISIS? I'd love to be able to see "what episodes are actively being worked on?" or "Has Editor FictionalGuyWhoDoesn'tWork made any actual edits today?" Avid could simply collate .lck files to see who has what open. Speaking of... Avid could also do that to figure out which locks are old and eliminate them automatically. Something like: Editor Jim isn't logged into Avid, but he still has some locks that were not set by selecting "lock bin" from the menu. Those must be locks that resulted from a crash - let me remove those and compare the data in those bins to what was saved in the attic.
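The lock logic described above is simple enough to sketch. The actual contents of a .lck file aren't covered here, so this example assumes the three facts the paragraph needs have already been read into a hypothetical `Lock` record: which bin, which user, and whether the lock was set on purpose from the menu.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    bin_name: str
    owner: str
    set_from_menu: bool  # True if the user explicitly chose "lock bin"


def open_bins_by_user(locks):
    """Collate locks into a 'who has what open' report."""
    report = defaultdict(list)
    for lock in locks:
        report[lock.owner].append(lock.bin_name)
    return dict(report)


def stale_locks(locks, logged_in_users):
    """Locks held by someone who isn't logged in and didn't set them by hand.

    Per the reasoning above, these are probably left over from a crash.
    """
    return [lock for lock in locks
            if lock.owner not in logged_in_users and not lock.set_from_menu]


locks = [
    Lock("Ep 101 Act 2", "jim", set_from_menu=False),
    Lock("Ep 101 Selects", "jim", set_from_menu=True),
    Lock("Ep 102 Act 1", "ann", set_from_menu=False),
]
print(open_bins_by_user(locks))
print([lock.bin_name for lock in stale_locks(locks, logged_in_users={"ann"})])
# ['Ep 101 Act 2']
```

Only the crash leftover is flagged; Jim's deliberate lock and Ann's live one are left alone.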
Getting back to templates: Avid has access to my computer's data about date and time. It knows how long a sequence is. It could easily store data for a Show Name and Episode Name. So why can't I generate titles that use this data, and create a slate template that's automatically filled in with the fundamental information that goes on every single output? Being able to add data fields to titles that updated automatically would be a great time-saver (as well as a mistake saver).
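Here is a minimal sketch of an auto-filled slate, assuming the sequence length is available as a frame count. The slate layout, the `fill_slate` and `frames_to_timecode` names, and the sample show are invented for the example, and the timecode math is non-drop-frame at a whole-number frame rate only.

```python
from datetime import date

# Hypothetical slate layout; each {field} is filled from project data.
SLATE_TEMPLATE = "{show}\n{episode}\nTRT {duration}\n{date}"


def frames_to_timecode(frames, fps):
    """Convert a frame count to HH:MM:SS:FF (non-drop, integer fps)."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def fill_slate(show, episode, duration_frames, fps, today=None):
    """Fill the slate from stored show data, sequence length, and the clock."""
    today = today or date.today()
    return SLATE_TEMPLATE.format(
        show=show,
        episode=episode,
        duration=frames_to_timecode(duration_frames, fps),
        date=today.isoformat(),
    )


print(fill_slate("My Show", "Ep 107", 60725, 24, today=date(2014, 3, 5)))
```

Because the duration and date come from the sequence and the system clock, re-exporting after a trim updates the slate without anyone retyping it.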
Speaking of titles, why no spellcheck?
And what about exporting all of the same sorts of files that you can import? It's a pain in the ass to generate new graphics in Avid since you have to generate them as movies and then merge them using After Effects. Avid can read a file with an alpha channel... why can't it make one?
Why does ProTools automatically generate .WAV files named in the format "Track Number_File Name_Take Number" but Avid imports .WAV files expecting the format to be "Take Number_File Name_Track Number"? It's asinine.
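Swapping between the two naming orders described above is a one-function job. This sketch takes the formats exactly as stated in the paragraph and assumes underscore separators; the function name and the sample filename are made up. It splits on the first and last underscore so that file names containing underscores survive.

```python
from pathlib import PurePath


def swap_track_and_take(filename):
    """Turn 'Track_File Name_Take.wav' into 'Take_File Name_Track.wav'."""
    path = PurePath(filename)
    track, first_sep, rest = path.stem.partition("_")  # split at first "_"
    name, last_sep, take = rest.rpartition("_")        # split at last "_"
    if not (first_sep and last_sep and name):
        raise ValueError(f"expected Track_Name_Take, got {filename!r}")
    return f"{take}_{name}_{track}{path.suffix}"


print(swap_track_and_take("01_Interview_A_03.wav"))  # 03_Interview_A_01.wav
```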
Get your shit together, Avid. The future is now, or at least, it could be if you took a moment to consider that Computers are a Thing.
I will now return you to your regularly scheduled bitching about bugs and the complicated work-arounds for them.
-Judd
All fantastic points that Avid would do well to consider in order to stay competitive and create a real modern world class editing system. That said, this sort of automated workflow would all but eliminate AE's, wouldn't it? Much of the time AE's are paid for is spent doing all this stuff, particularly grouping. As an AE who hasn't made the jump to Editing, I'd be concerned about my job if Avid got on this. Haha. Or maybe the opposite would happen and AE's would finally go back to being actual Assistant EDITORS rather than bargain-priced IT people.
It's true that AEs have been relegated largely to grouping monkeys these days, but the fact of the matter is that an AE is still needed to physically push the tapes, troubleshoot the system, run outputs, build graphics templates, and then, of course, should my utopian plan come to pass, would be needed to verify that the computer didn't totally muck everything up.
Besides, PluralEyes is a real thing that exists, and you still have a job, right?
--
Another thing to add to this post: If Avid had facial recognition, we could just say "blur this guy" and then Avid could blur him everywhere he appeared in the entire cut. God damn I'd love that.
Long before I was cutting on AVID, when I was making the transition from film to videotape, I was able to perform a clean and trace on my EDL in order to recreate a new master EDL based on my edit history, down through all the generations of tape that I had cut. Now, 30 years later, and a mixdown in AVID is still a black, impenetrable box. Let's start with 30 year old technology before we start reaching for facial recognition. :)
Howdy folks - as it turns out, grouping has been completely automated now too with a service called Group It For Me! It actually creates the most efficient groups by NOT putting an add edit at the head and tail of every clip in the sequence which actually has a phenomenal impact on the processing speeds of groups - especially when there's multiple camera-people out there with a severe case of Parkinson's (ie buttons off and on a cray cray amount of times).
My company uses this technology, and many of my colleagues around the world are using this in their workflows and lo and behold - we are all still employed.... in fact, I/we are loving the fact that our time can now be utilised to do what you guys are actually talking about - actually assisting with the edit..... which is what we're all actually passionate about, right?
They are actually offering 10 free processes to new signups to start you off so if you haven't checked it out, I'd go give it a crack. You've got nothing to lose, and HOURS to gain lol :)
PS: love the note about AE's becoming actual assistant EDITORS rather than bargain-priced IT people lol!!!
EDIUS has multicam in a stream. Might not be automatic, but at least you can resync without fucking about with new groups and match frame.