Video Encoding, Still Slow in 2014

Over at Oasis Digital, some of us work together in our St. Louis office, while others are almost entirely remote. I’ve written before about tools for distributed teams, and we’ve added at least one new tool since then: talk while drawing on a whiteboard, record video, upload it to where the whole team can watch it. There are fancier ways to record a draw-and-talk session (for example, record audio of the speaker, and video of drawing on a Wacom tablet), but it’s hard to beat the ease of pointing a video camera and pressing Record.

This is effective, but I am disappointed by the state of video encoding today.

Good: Recording with a Video Camera

We tried various video cameras and still cameras with a video feature. There are problems:

  • Some cameras have a short video length limit.
  • Some cameras emit a high-pitched noise from their autofocus system, picked up by the microphone. (The worst offender I’ve heard so far is an Olympus camera. Many other cameras manage to autofocus quietly.)
  • Some cameras have a hilariously poor microphone.
  • Many cameras lack the ability to accept an external microphone.
  • Some cameras have a hard time focusing on a whiteboard until there is enough written on it.
  • Many cameras lack the depth of field to accommodate both a whiteboard and a person standing in front of it.

For example, I recorded a whiteboard session yesterday, 38 minutes of video, using a GoPro camera at 1080 HD (so that the whiteboard writing is clear). The GoPro camera did a good job. I’d prefer it have a narrower field of view (a longer lens) to yield a flatter, less distorted image, but it is acceptable for this impromptu daily use.

The GoPro produces high quality, but poorly compressed, MP4 video. In this example, the file size was ~5 GB for 38 minutes.

Bad: Uploading and Downloading

The question is: how to provide this video to others? We have a good internet connection at headquarters, but 5 GB still takes a while to upload and download. Even if the transfer speed were greatly improved, as a person who remembers computers with kilobytes of memory, I find 5 GB for 38 minutes morally offensive.

Good: Re-Encoding to Reduce File Size

The answer, of course, is to re-encode the video. A better encoder can pack 38 minutes of HD video into much less than 5 GB. Keeping the resolution the same, with common encoding systems this 5 GB is easily reduced by 80% or more. If I’m willing to give up some resolution (1080 to 720), it can be reduced by 90%.
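
For a sense of scale, a re-encode along these lines gets that sort of 80% reduction (a sketch, not my exact invocation; the file names are made up, and -crf 23 is a typical x264 quality setting):

ffmpeg -i gopro-whiteboard.mp4 -vcodec libx264 -preset slow -crf 23 -acodec copy whiteboard-reencoded.mp4

Adding something like -vf scale=-2:720 to drop the resolution to 720 is roughly the 90% version.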

By the way, I describe this as re-encoding rather than transcoding because most often the source and the output use the same codec (H.264). Sometimes we use Google’s VP8/9/WebM/whatever instead, as that sounds like it might be “the future”.
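
If we take the WebM route, the equivalent sketch uses ffmpeg’s VP9 encoder (again, file names illustrative; -b:v 0 puts libvpx-vp9 into constant-quality mode):

ffmpeg -i gopro-whiteboard.mp4 -vcodec libvpx-vp9 -crf 33 -b:v 0 -acodec libvorbis whiteboard.webm

In my limited experience, VP9 encoding is even slower than x264, which only worsens the encoding-speed complaint below.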

Bad: Re-Encoding in the Cloud

I love the idea of making re-encoding someone else’s problem. Various companies offer a video encoding service in the cloud; for example, Amazon has Elastic Transcoder. But that doesn’t solve the problem I actually have, which is that I want to never upload 5 GB in the first place.

Good: Ease of Re-Encoding / Transcoding

There are plenty of tools that will perform the re-encoding locally with a few clicks; I have used several such tools recently, and many more in the past.

Worse: Encoding Speed

With the computers I most often use (a MacBook Air, a Mac Mini, various others), the encoding time for good compression is a multiple of real time. For example, the experience yesterday which led to this blog post: 38 minutes of video took 3+ hours to re-encode on my Air. (It would have been considerably faster on the speedier Mac Mini, but still a long time.)

Is There a Good Solution?

I’m hoping someone will point me to a piece of hardware that does high quality transcoding to H.264 or WebM, at real-time speed or better, for a not-too-painful price. It seems like such a thing should exist. It is just computation, and there are ASICs to greatly speed some kinds of computations (like Bitcoin mining) at low cost. There are GPUs, which speed video rendering, but from reviews I have seen, GPU-based video encoders tend to do a poor-quality job, trading off video quality and compression rate for speed. Is there a box that speeds up transcoding greatly, while keeping high quality and high compression? Or is the only improvement a very fast (non-laptop) PC?

 

Lua Doesn’t Suck – Strange Loop 2010 video

At Strange Loop 2010, I gave a 20 minute talk on Lua. The talk briefly covered six reasons (why, not how) to choose Lua for embedded scripting. Lua is safe, fast, simple, easily learned, and more popular than you might expect.

The Strange Loop crew only recorded video in the two largest venues (out of six), so I made a “bootleg” video of my talk, for your viewing pleasure:

[Embedded video of the talk]

The video/audio sync starts out OK, but drifts off by a second or so by the end. The drift is minor, so it is reasonably viewable all the way through. If you don’t have Flash installed (and thus don’t see the video above), you can download the video (x264); it plays well on most platforms (including an iPad).

The slides are available for PDF download.


Video Hackery

This video recording was an experiment: instead of hiring a video crew (with professional equipment), or using my DV camcorder, I instead used the video recording capability of my family’s consumer-grade Canon digicam. This device has three advantages over my DV camcorder:

  1. No tape machinery; no motors; thus no motor noise in the audio.
  2. Smaller size, easier to carry in and out.
  3. Directly produces a video file, easily copied off its SD card.

As you can see from the results, the video quality is adequate but not great. Still, I learned that if I want to increase the quality of recording, the first step is not to use a better camera or lens! Rather, it is to bring (or persuade the venue to provide) better light. For good video results, the key is to light the speaker well, without shining any extra light on the projector screen. With that in place, a better camera makes sense.

The audio was a different story. Like nearly all consumer video cameras (and digicams with video), mine doesn’t have an external audio input, so the audio (from ~12 feet away) was awful. As a backup I had used a $75 audio recorder and a $30 lapel microphone, and that audio is very good, certainly worth using instead of the video recording audio track.

To combine the video in file A with the audio in file B, I used the ffmpeg invocation below. I reached the time adjustments below in just a few iterations of trial and error, by watching the drafts in VLC, using “f” and “g” to experiment with the audio/video time sync. I also trimmed off a bit of the bottom of the video, and used “mp4creator.exe -optimize”, which I had handy on a Windows machine, to prepare the file for progressive download viewing.

ffmpeg -y -ss 34.0 -i WS_10001.WMA -ss 34.0 -itsoffset -12.05 -i MVI_4285.AVI -shortest -t 8000 -vcodec libx264 -vpre normal -cropbottom 120 -b 400k -threads 2 -async 200 Cordes-2010-StrangeLoop-Lua.m4v
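
For anyone puzzling over that line, here is my reading of what each piece does (based on the ffmpeg documentation of that era; flag behavior varies across ffmpeg versions):

# -y                      overwrite the output file without asking
# -ss 34.0 (1st input)    skip the first 34 seconds of the WMA audio
# -ss 34.0 -itsoffset -12.05 (2nd input)
#                         trim the AVI video and shift it ~12 s to line up
# -shortest -t 8000       stop at the shorter input (8000 s is a generous cap)
# -vcodec libx264 -vpre normal
#                         encode video with x264, using its "normal" preset file
# -cropbottom 120         crop 120 pixels off the bottom of the frame
# -b 400k                 target roughly 400 kbit/s video bitrate
# -threads 2              use two encoder threads
# -async 200              stretch/squeeze audio timestamps toward sync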

The remaining bits of technology are FlowPlayer, a WordPress FlowPlayer plugin, and a CDN.

SaaS: The Business Model – Video

On Feb. 27 at St. Louis Innovation Camp 2010, I gave a talk on the SaaS business model. I posted the slides, handout, audio, and transcript soon thereafter. Here is the 44-minute video of the talk, conveniently on YouTube:

But until I revisited this page in 2020, the video situation was much more complex; back in 2010, it took three months to post.

[Embedded video player]

Warning: Sausage-making Discussion Below

The following has nothing to do with the content of the video.

This is an x264 video, shown here initially with a Flash-only player (FV WordPress Flowplayer). Later I’ll replace this Flash-only widget with one that offers HTML5 video (for iPad use, in particular), when I find one that works sufficiently well.

That’s the easy part, though. Getting this video to you here was an adventure, and not in a good way. Three recordings were made of the talk:

  1. We hired a professional videographer to record the talk. When I say professional, I mean it only in the most literal way, i.e. the videographer charged money. They showed up with a nice camera and a wireless lapel mic… but somehow produced a broken video recording (the first 10-15 minutes were intermittent video noise). In addition, the mic gain was turned up way too high and thus the audio is awful.
  2. Dave Blankenship recorded the talk on his consumer camcorder; he was not paid for this, yet he did a much better job. This video is usable all the way through, but arrived in an oddball format produced mostly by some models of JVC camcorders. The audio was not so hot, because he used the mic built in to the camcorder from the back of the room.
  3. I recorded the audio using a $5 microphone plugged in to an iPod Nano, sitting on a table at the front of the room. It’s a bit noisy, but with a few minutes of work with Audacity (Noise Removal and Normalization), the results are much better than either video attempt.

Armed with this, I set about to somehow combine the video from #2 with the audio from #3. I sent emails describing this mess to several videographers I found on Craigslist. Most of them didn’t reply at all. I finally got a cost estimate from one, of many hundreds of dollars or more, and not much assurance of results.

Now I’m willing to spend some money to get good results, but spending it without confidence of results is less appealing; so I set about trying myself instead.

First, I cleaned the audio in Audacity as mentioned above.

Second, I watched the video and listened to the audio a few times, to find the approximate timestamp, in each recording, of the moment the talk actually started; each recording had a different amount of lead-in time.

Third, I grabbed ffmpeg, the Swiss Army knife of command line video and audio processing. After reading a dozen web pages of ffmpeg advice, and a number of experiments (with short -t settings, to quickly see how well it works without waiting to transcode the whole thing), I ended up with this command to produce the encoded video:

ffmpeg -y -ss 40.0 -i Recording-3-audio-only-clean.wav -ss 95 -i Recording-2-video-ok-audio-bad.mod -shortest -t 18000 -vcodec libx264 -vpre normal -b 700k -threads 2 Cordes-2010-SaaS.m4v
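
(The quick experiments mentioned above are just this same command with a small duration cap, so each trial takes seconds rather than hours; the output name here is made up:)

ffmpeg -y -ss 40.0 -i Recording-3-audio-only-clean.wav -ss 95 -i Recording-2-video-ok-audio-bad.mod -shortest -t 30 -vcodec libx264 -vpre normal -b 700k -threads 2 sync-test.m4v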

I then noticed that the MacPorts installation of ffmpeg omits the important qt-faststart tool, and found this helpful version of qt-faststart and used it instead, on my Mac; later I switched to a Linux machine with an ffmpeg install including qt-faststart. Without the faststart step, the metadata in the m4v file is arranged in a way that prevents progressive/streaming play-while-downloading.
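
Usage is trivial; qt-faststart takes an input file and an output file, and rewrites the file with the metadata (“moov atom”) moved to the front (output name illustrative):

qt-faststart Cordes-2010-SaaS.m4v Cordes-2010-SaaS-fast.m4v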

The results are good but not great:

  • The video has some motion/interlace artifacts; these were present in the original recording, and I’m not aware offhand of what to do about them.
  • The video camera used rectangular pixels; the pixel aspect ratio is 3:2 while it is intended for display at 16:9. I wasn’t able (at least in 20 minutes of learning and experimentation) to get the 16:9 output working correctly, so if you grab the underlying m4v file you can see the aspect ratio a bit off in the shape of the clock on the wall, for example (see the sketch after this list).
  • The audio-video sync is adequate (and plenty good enough to follow along) but not perfect. Clearly using the audio track on a video recording is much better than putting them together in post-processing.
  • The audio is not as good as if I had used a lav or headset mic, though I think it’s quite remarkably good for a $5 mic plugged into an iPod.
  • I’ve no idea if ffmpeg complies with any of the relevant copyrights/patents/whatever in video production, though it seems hopefully safe to use for a one-off non-commercial video like this. (Normally I use Apple’s iMovie for my videos, and I assume Apple has taken care of such things.)
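
About the aspect ratio: if I revisit it, I’d expect something along these lines to tag the output as 16:9, at the cost of another encoding pass (an untested sketch; the output name is made up):

ffmpeg -i Cordes-2010-SaaS.m4v -aspect 16:9 -vcodec libx264 -b 700k -acodec copy Cordes-2010-SaaS-16x9.m4v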

A few morals of this story:

  • Get some powerful tools, and learn how to use them.
  • Be willing to pay for professional work, but be skeptical. Just because you pay, doesn’t mean it will be quality work.
  • Have a plan B. If I had assumed that at least one of the two videos would get decent audio, and skipped my own audio recording, I’d not have been able to deliver the acceptable audio here. If Dave had assumed that my professional videographer would produce results, and turned off his camera, we’d have no video here at all.

Take a Strategic Vacation

This is yet another story that I’ve told dozens of times to individuals and groups, and now finally written down. Here is a short video talk:

Back in 2004 I co-founded Mobile Workforce Management, a vertical market SaaS firm. For the first 6+ months, I was the entire development team, while my co-founder was the entire analysis, support, and customer happiness department. Over the course of a few years, we hired developers, a very-senior developer / leader / general manager, support staff, and more. In spite of these hires, as of 2007 I was still in the loop for numerous critical processes that had to happen every day or week to keep the doors open – not a great situation.

Around that time I was inspired to take a month-long family vacation, far longer than any past vacation. My family made arrangements to spend 3 weeks in a house by the beach, 1000 miles away, in the summer of 2008; these arrangements must be made far in advance, as such houses tend to fill up. I’d be away for approximately an entire month, allowing for travel time and stops along the way.

With that hard date in hand, my notions of ironing out the business processes “someday” were swept aside, and I set about tracking, automating, documenting, and delegating any of the work that involved me and had to happen at least monthly.

  • accounting / bookkeeping / payroll
  • production sysadmin
  • development sysadmin
  • system monitoring
  • management processes
  • customer relationship processes
  • vendor relationships
  • design and code reviews
  • much more

It took months of hard work (by myself and others) to build up our company’s ability to handle all of these things well in my absence. As of the vacation date, all of this was set up to run smoothly either entirely without me, or with a tiny bit of remote input from me.

This worked; in fact, it worked so well that our customers didn’t even notice my absence.

Though I didn’t know it at the time, the work I did then to increase our organizational process maturity was a turning point in the life of the business, enabling its eventual sale. Before that work, I’d have been a bit embarrassed to say “organizational process maturity” in public. Afterward, I have lived (rather than just learned about and talked about) the notions of working on-rather-than-in a business, of building a business with a life separate from that of its owners.

In retrospect I’m calling that trip a Strategic Vacation – a vacation taken both for its own value, and to drive the accomplishment of other critical goals. If your business needs you every single day, that’s a problem. Create some pressure on yourself to solve it by scheduling a strategic vacation, then go make it happen.

The Prolog Story

I’ve told this story in person dozens of times; it’s time to write it down and share it here. I’ve again experimentally recorded a video version (below), which you can view on a non-Flash device here.

The Prolog Story from Kyle Cordes on Vimeo.

I know a little Prolog, which I learned in college – just enough to be dangerous. Armed with that, and some vigorous just-in-time online learning, I used Prolog in a production system a few years ago, with great results. There are two stories about that woven together here; one about the technical reasons for choosing this particular tool, and the other about the business payoff for taking a road less travelled.

In 2004 (or so) I was working on a project for an Oasis Digital customer on a client/server application with SQL Server behind it. This application worked (and still works) very well for the customer, who remains quite happy with it. This is the kind of project where there is an endless series of enhancements and additions, some of them to attack a problem-of-the-moment and some of them to enrich and strengthen the overall application capabilities.

The customer approached us with a very unusual feature request – pardon my generic description here; I don’t want to accidentally reveal any of their business secrets. The feature was described to us declaratively, in terms of a few rules and a bunch of examples of those rules. The wrinkle is that these were not “forward” rules (if X, do Y). Rather, these rules described scenarios, such that if those scenarios happen, then something else should happen. Moreover, the rules concerned complex transitive/recursive relationships, the sort of thing that SQL is not well suited for.

An initial analysis found that we would need to implement a complex depth/breadth search algorithm either in the client application or in SQL. This wasn’t a straightforward graph search, though; rather, that part was just the tip of the iceberg. I’m not afraid of algorithmic programming, and Oasis Digital is emphatically not an “OnClick-only” programming shop, so I dug in. After spending a couple of days attacking the problem this way, I concluded that this would be a substantial block of work, at least several person-months to get it working correctly and efficiently. That’s not a lot in the grand scheme of things, but for this particular customer, this would use up their reasonable-but-not-lavish budget for months, even ignoring their other feature needs.

We set this problem aside for a few days, and upon more thought I realized that:

  • this would be a simple problem to describe in Prolog
  • the Prolog runtime would then solve the problem
  • the Prolog runtime would be responsible for doing it correctly and efficiently, i.e. our customer would not foot the bill to achieve those things.

We proceeded with the Prolog approach.

….

It actually took one day of work to get it working, integrated, and into testing, then a few hours a few weeks later to deploy it.

The implementation mechanism is pretty rough:

  • The rules (the fixed portions of the Prolog solution) are expressed in a Prolog source file, a page or two in length.
  • A batch process runs every N minutes, on a server with spare capacity for this purpose.
  • The batch process executes a set of SQL queries (in stored procs), returning a total of tens or hundreds of thousands of rows of data. SQL is used to format that query output as Prolog terms. These stored procs are executed using SQL Server BCP, making it trivial to save the results in files.
  • The batch process runs a Prolog interpreter, passing the data and rules (both are code, both are data) as input. This takes up to a few minutes.
  • The Prolog rules are set up, with considerable hackery, to emit the output data we needed in the form of CSV data. This output is directed to a file.
  • SQL Server BCP imports this output data back into the production SQL Server database.
  • The result of the computation is thus available in SQL tables for the application to use.
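
Put together, the whole batch job amounts to something like the sketch below. Every specific here (server and database names, stored procedure names, file names, the emit_results entry point, and the choice of SWI-Prolog as the interpreter) is illustrative rather than from the actual system:

# 1. Export: a stored proc formats rows as Prolog terms; BCP writes them to a file.
bcp "exec AppDb.dbo.ExportFactsAsProlog" queryout facts.pl -c -T -S DBSERVER

# 2. Compute: load the rules and facts, run the query, emit CSV on stdout.
swipl -q -s rules.pl -g "consult('facts.pl'), emit_results, halt" > results.csv

# 3. Import: BCP loads the computed results back into production tables.
bcp AppDb.dbo.ComputedResults in results.csv -c -t, -T -S DBSERVER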

This batch process is not an optimal design, but it has the advantage of being quick to implement, and robust in operation. The cycle time is very small compared to the business processes being controlled, so practically speaking it is 95% as good as a continuous calculation mechanism, at much less cost.

There are some great lessons here:

  • Declarative >>> Imperative. This is among the most important and broad guidelines to follow in system design.
  • Thinking Matters. We cut the cost/time of this implementation by 90% or more, not by coding more quickly, but by thinking more clearly. I am a fan of TDD and incremental design, but you’re quite unlikely to ever make it from a handcoded solution to this simply-add-Prolog solution that way.
  • The Right Tool for the Job. Learn a lot of them, don’t be the person who only has a hammer.
  • A big problem is a big opportunity. It is quite possible that another firm would not have been able to deliver the functionality our customer needed at a cost they could afford. This problem was an opportunity to win, both for us and for our customer.

That’s all for now; it’s time for LessConf.

When Will It Ship? Estimates and Promises

I’m trying something new with this post: a short video presentation of approximately the same content.


Here is an area of confusion that has come up both at Oasis Digital and at every other firm where I’ve worked:

estimate ≠ promise

Background: Unpredictability

Around half of my software development and leadership experience has been in enterprise/internal software development, and that is the world I am thinking of as I write this.

Software development, like other endeavors with a significant creative component, is inherently unpredictable. With a good, deep understanding of the development process, you can build a model of the probability distribution of the cost, effort, and elapsed time for software development work. In the large, on average this can be made to work: small and large projects can succeed, within some broad range of predictability.

But notice also how common it is for large complex projects (in software and elsewhere) to be farcically over budget and late. This is not (usually) due to incompetence or fraud. It is because of the inherent unpredictability of the work.

If someone claims that they (or you) can exactly predict software development work, they are:

  • mistaken, or
  • lying, or
  • padding their estimates very substantially, stating a date or cost much later/higher than a neutral median estimate would suggest

As a customer of software development services, and as a provider of such services, I don’t want any of those things.

Blame the Service Trades

I place some of the blame for the confusion of these two wildly different things, on the service trades: it is common for auto repair shops, roof installers, landscapers, and the like to offer something they call an estimate, but which is actually a fixed price quote (a promise).

Sadly, while there are plenty of common good synonyms for promise, there aren’t many for estimate. We’re stuck with using the word estimate, and explaining that we really mean it as defined. Perhaps in a few more decades we will lose the word entirely, much like the word “literally” has come to mean its antonym, “figuratively”, which renders it mostly useless.

Estimates

An estimate is an approximation of an unknown quantity. Typically in the world of software development, it is a prediction of the cost, working hours, or delivery date of a project or milestone. It is not in any sense a commitment, any more than estimating the temperature outside this afternoon is a commitment.

As the word implies, a customer reasonably expects the actual value to vary somewhat, in either direction, from the estimate. In fact, if an estimate turns out to exactly match the actual result, there is a good chance the books have been cooked. Moreover, if the work is completed at-or-before the estimate most of the time, this means the estimates (on average) are too high.

An estimate “costs” nothing, other than the time/effort required to create it, which consists of analyzing the work at hand, decomposing it into parts, and comparing those parts to past work.

Promises

A promise, also called a commitment, deadline, quote, fixed price, etc. is a different beast entirely.

With a promise in hand, a customer should expect with high confidence that the actual value (for cost, hours of work, delivery date) will be less than (before), or equal to, the promised value/date.

Be wary of a promise easily made and freely given: it probably doesn’t mean anything at all. A wise customer (and I aim to count myself in this category) should expect that a casually made commitment will probably be broken; not because the maker is morally defective, but simply because meeting a commitment for complex work requires considerable effort and thought. Without evidence that such effort is underway, it would be mere wishful thinking to expect the results delivered as promised.

Likewise, keeping promises often has a cost. If the work underway gets behind the schedule needed to meet the promise, something will have to give:

  • Other work may fall behind, as time and effort are diverted to meet the promise.
  • Weekends, evenings, and overtime work may be needed. These might appear free, but are not.
  • Staff may need to be reassigned, or added.
  • Additional hardware and software may be needed.

These risks cost real money; thus a wise promise-maker will find that, on average, it costs more to promise feature X by date D than to deliver feature X by date D without such a promise.

Estimates are Cheaper, so Prefer Estimates

At Oasis Digital, we provide many estimates, but few promises. Most of the time, an estimate is what our customers need, and we can provide an estimate with very little cost. Typically we estimate reasonably well:

  • small features usually arrive within a day or two (plus or minus) of the estimated delivery date (and likewise for cost)
  • medium items within a week or two of the estimate, likewise
  • large items (major new features or subsystems with complex interdependencies) within a month or so, likewise

The key here is that with good estimates, commitments (promises) aren’t needed very often, and therefore the cost of promises can be avoided.

But Learn How to Promise Well, Also

Yet occasionally, a customer needs a commitment, most often because a software version needs to be available to match an important business event with a fixed date, such as a presentation, a legal filing, etc. I’ll follow up later (no promise or estimate as to when) with thoughts on:

  • how to credibly make promises (as a service provider)
  • how to evaluate promises (as a customer)