Bits are Free, People are Valuable

A few days ago, I caught myself thinking about whether to save some images and video: whether the likely future value of those megabytes would be greater or less than the cost of storing them. That sort of thought was important and valuable… a couple of decades ago.

Bits are Free

Today, and at least for the last decade, bits are so absurdly cheap that they can be considered free, compared to the time and energy of people. Storing gigabytes or terabytes because they might come in handy is not just for government agencies, it makes sense for all of us in our daily work.

Waste bits to save time.

Waste bits to help people.

People matter, bits are free.

Bits are Free at Work

Here are some ways I waste bits at work:

  • We often gather a few people working on a project to meet or code together; it is very easy to start a video or screen recording of the work. That recording can then be shared with anyone on the project who wasn’t present.
  • We record screenshots or videos of the feature in progress, and send them to a customer. Yes, we could wait and present to them “live” using WebEx or whatever. But it is cheaper to waste the bits and conserve human coordination effort.
  • If I have something complex to explain to someone, I can do it in person, I can do it on the phone, or I can write lots of text. But if I already know them well enough to partly anticipate their questions, I will start a video recording and explain it using voice and a whiteboard. The bits are cheaper than the coordination needed to work “live”. The bits are cheaper than asking them to figure it out themselves.
  • Driving down the road, it is unsafe to text, or read, or write, or (I am looking at you, morning commuters…) apply makeup. But while the numbers are unclear, we have a broad assumption that merely talking with someone while driving down the road is OK. I sometimes make use of traffic time, and burn some bits, by recording audio about some problem to be solved or other matter. The bits are free, who cares if this uses 1000x as much data as sending an email?

Wasting bits can grow somewhat extreme. In the first example above, I described a screen video of developers working together, recorded for the benefit of other developers. You might imagine this as a high-ceremony process, done on important occasions for important code. But bits are free – so we sometimes record such a video even if no developer will ever study it. The value is there even if just one developer lets it play in the background while working on something else – much as when working in the same room as other people, it is common to pick up some important bit in the background. If one developer learns one small thing that makes their work better, that is worth more than 400 MB of video storage space.

Bits are Free at Home

Here are some ways I waste bits at home:

  • When I take photographs, I take a lot of photographs. One might come in handy. Who cares if 95% of them are bad and 90% of them are useless?
  • Why do cameras have settings other than maximum resolution? Bits are free.
  • Nearly 100% of paperwork I get in the mail, other than marketing, I scan, OCR, and keep in a searchable archive. The disk space for this costs almost nothing. The time saved deciding whether to keep each item, and how to file each item, is irreplaceable. I probably only ever look at 10% of what is scanned, but who cares?
  • If my kids are doing something even vaguely interesting, I try hard to remember to take photos and record video. Looking back at when they were very young, snippets of video we have (from before every phone had a decent video camera) are priceless. I can’t reach back and record more of those, but I can easily record things now that might be fun in the future. Who cares if 95% of that video no-one ever looks at? If I ever need to go through it, I can do it then. The storage space in the meantime doesn’t matter.
  • If I need to look at a model number, serial number, etc. of anything around the house, I snap a photo of it. Then I can look back at the photo anytime from anywhere. Yes, it is absurd to store 3,000,000 bytes of JPG rather than 20 bytes of serial number. But they both round to zero.

How Free Can Bits Get?

I expect this tradeoff will shift even further as storage keeps getting exponentially cheaper. In a couple of decades, perhaps:

  • We will record 5 petabyte ultra beyond-HD holographic recordings of insignificant activities just in case someone wants to watch them.
  • We will run video and audio recording of our lives 24 x 7, to be indexed just in case anyone ever wants to look back at it.
  • Future cameras won’t even come with an on-off switch, and will instead record continuously for the lifetime of the camera.

 

HTML Syntax: Threat or Menace?

Some developers love HTML: its syntax, its angle brackets, its duplicated tag names, its scent, its silky smooth skin. If you are among them, you probably won’t like this post.

I appreciate the practicality of HTML: HTML is ubiquitous, so nearly every developer already knows it. Nearly every editor and IDE already syntax-highlights and auto-formats HTML. But I don’t care for HTML’s syntax. On some projects, I use tools that offer an alternative, simpler syntax for the underlying HTML/DOM data model.

Indentation-Based HTML Alternatives

The most popular HTML alternative syntax is a Python-like indentation-based syntax, in which element nesting is determined by start-of-line whitespace. Implementations of this idea include Jade, HAML, and Slim.

Of those, Jade seems the most polished, and there are multiple Jade implementations available: the original (JavaScript, for use in Node or the browser), Java, Scala, Python, PHP, and possibly others. Jade looks like this:
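Here is a small, illustrative Jade snippet (hypothetical, not from any real project):

```jade
doctype html
html
  head
    title Example page
  body
    h1 A heading
    p.intro Nesting is expressed by indentation,
      | so there are no closing tags to balance.
    ul
      li First item
      li Second item
```

Each line becomes an element; classes (like .intro above) and text follow on the same line, and the pipe prefix continues plain text across lines.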

[Screenshot: the html2jade online converter]

There is a free, helpful HTML to Jade converter online thanks to Aaron Powell, shown in the above image. It can be used to translate HTML documents or snippets to Jade in a couple of clicks.

Non-HTML HTML Advantages

  1. Jade (and the other tools) trivially get a lot of things right, like balanced tags, with no need for IDE support.
  2. Generated HTML is automatically “pretty” and consistent.
  3. Generated HTML is always well formed: Jade doesn’t have a way to express unbalanced tags!
  4. Very concise and tidy syntax.
  5. Cleaner diffs for code changes – less noise in the file means less noise in the diff.

Non-HTML HTML Disadvantages

  1. Another Language to Learn: People already know HTML; any of these tools is a new thing to learn.
  2. Build Step: any of these tools needs a build step to get from the higher level language to browser-ready HTML.
  3. Limited development tool support – you might get syntax highlighting and auto-completion, but you will need to look around and set it up.

Non-Arguments

  1. There is no “lock in” risk – any project using (for example) Jade as an alternative HTML syntax, could be converted to plain HTML templates in a few hours or less.
  2. Since it is all HTML at runtime, there is no performance difference.

Conclusion

On balance, I think it is a minor win (nice, but not indispensable) to use a non-HTML HTML syntax. At work, we do so on many of our projects.

 

AngularJS $q Promise Delayer

In class a couple of months ago (the Angular Boot Camp I often teach), a student asked how to do something like Thread.sleep(n) in JavaScript. Of course there isn’t such a thing, at least in the main JS execution environments (browsers, Node). Rather, there is the asynchronous equivalent, setTimeout().

But there is an equivalent, nearly as terse way to insert a delay in a promise chain. Here is a short (though perhaps not optimally short) sleep equivalent. This is for use in an AngularJS app, with its $q promise implementation.


angular.module("whatever", [])

// kcSleep(ms) returns a function suitable for a .then() chain: it
// passes its input value through unchanged, after a delay of ms.
.factory("kcSleep", function($timeout) {
  return function(ms) {
    return function(value) {
      // $timeout returns a promise that resolves with the callback's
      // return value once the delay elapses, so "value" flows onward.
      return $timeout(function() {
        return value;
      }, ms);
    };
  };
});

// to use it:

somePromise
  .then(kcSleep(1000))
  .then(whatever);

Video Encoding, Still Slow in 2014

Over at Oasis Digital, some of us work together in our St. Louis office, while others are almost entirely remote. I’ve written before about tools for distributed teams, and we’ve added at least one new tool since then: talk while drawing on a whiteboard, record video, upload it to where the whole team can watch it. There are fancier ways to record a draw-and-talk session (for example, record audio of the speaker, and video of drawing on a Wacom tablet), but it’s hard to beat the ease of pointing a video camera and pressing Record.

This is effective, but I am disappointed by the state of video encoding today.

Good: Recording with a Video Camera

We tried various video cameras and still cameras with a video feature. There are problems:

  • Some cameras have a short video length limit.
  • Some cameras emit a high-pitched noise from their autofocus system, picked up by the microphone. (The worst offender I’ve heard so far is an Olympus camera. Many other cameras manage to autofocus quietly.)
  • Some cameras have a hilariously poor microphone.
  • Many cameras lack the ability to accept an external microphone.
  • Some cameras have a hard time focusing on a whiteboard until there is enough written on it.
  • Many cameras lack the depth of field to accommodate both a whiteboard and a person standing in front of it.

For example, I recorded a whiteboard session yesterday, 38 minutes of video, using a GoPro camera at 1080 HD (so that the whiteboard writing is clear). The GoPro camera did a good job. I’d prefer it have a narrower field of view (a longer lens) to yield a flatter, less distorted image, but it is acceptable for this impromptu daily use.

The GoPro produces high quality, but poorly compressed, MP4 video. In this example, the file size was ~5 GB for 38 minutes.

Bad: Uploading and Downloading

The question is: how to provide this video to others? We have a good internet connection at headquarters, but 5 GB still takes a while to upload and download. Even if the transfer speed was greatly improved, as a person who remembers computers with kilobytes of memory, I find 5 GB for 38 minutes morally offensive.

Good: Re-Encoding to Reduce File Size

The answer of course, is to re-encode the video. A better encoder can pack 38 minutes of HD video into much less than 5 GB. Keeping the resolution the same, with common encoding systems this 5 GB is easily reduced by 80% or more. If I’m willing to give up some resolution (1080 to 720), it can be reduced by 90%.
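As a concrete sketch of what that re-encoding looks like (the settings and file names here are illustrative examples, not the exact recipe we use), ffmpeg can do the job:

```shell
# Sketch: re-encode a camera MP4 to a much smaller H.264 file.
# Requires ffmpeg; settings and file names are illustrative only.
reencode() {
  # -crf 23 trades file size for quality (lower = higher quality);
  # -preset slow spends more CPU time for better compression;
  # the audio track is copied through unchanged.
  ffmpeg -i "$1" -c:v libx264 -preset slow -crf 23 -c:a copy "$2"
}

# Same idea, also scaling 1080p down to 720p for a further reduction:
reencode_720p() {
  ffmpeg -i "$1" -vf scale=-2:720 -c:v libx264 -preset slow -crf 23 -c:a copy "$2"
}

# usage: reencode gopro-session.mp4 session-small.mp4
```

The CRF (constant rate factor) knob is where the 80–90% size reduction comes from: the camera encodes in real time with little compression effort, while an offline encoder can spend far more computation per frame.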

By the way, I describe this as a re-encoding rather than transcoding because most often we use the same encoder (H.264) for both. Sometimes we use Google’s VP8/9/WEBM/whatever, as that sounds like it might be “the future”.

Bad: Re-Encoding in the Cloud

I love the idea of making re-encoding someone else’s problem. Various companies offer a video encoding service in the cloud; for example Amazon has Elastic Transcoder. But that doesn’t solve the problem I actually have, which is that I want to never upload 5 GB in the first place.

Good: Ease of Re-Encoding / Transcoding

There are plenty of tools that will perform the re-encoding locally with a few clicks; I have used several of them recently, and many more in the past.

Worse: Encoding Speed

With the computers I most often use (a MacBook Air, a Mac Mini, various others) the encoding time for good compression is a multiple of real time. For example, my experience yesterday, which led to this blog post, was 38 minutes of video re-encoding in 3+ hours on my Air. (It would have been considerably faster on the speedier Mac Mini, but still a long time.)

Is There a Good Solution?

I’m hoping someone will point me to a piece of hardware that does high quality transcoding to H.264 or WEBM, at real-time speed or better, for a not-too-painful price. It seems like such a thing should exist. It is just computation, and there are ASICs to greatly speed some kinds of computation (like Bitcoin mining) at low cost. There are GPUs, which speed video rendering, but from reviews I have seen, GPU-based video encoders tend to do a poor-quality job, trading off video quality and compression rate for speed. Is there a box that greatly speeds up transcoding, while keeping high quality and high compression? Or is the only improvement a very fast (non-laptop) PC?

 

Code Reviews – Use a Gate, Not a Drive-By

Does your team / project use code reviews? If not, I suggest starting today. There are countless resources online about how and why to do code reviews; there are numerous code review tools, open source and commercial.

Who should review code? A great question, perhaps for other blog posts.

When should code be reviewed? Hopefully also a great question, and the topic of this blog post.

Context

When making a recommendation, I’m trying to get in the habit of always mentioning the context, the situation in my work from which my recommendation arises. This is in response to seeing frustration when people try to follow advice, not realizing that their situation is completely unlike that of the writer.

Here at Oasis Digital, our context is mostly complex line-of-business applications, which are expected to live for a long time and are important to the business operations of the companies that use them. We are almost always replacing an existing system (which a company relies on) or expanding an existing system (which a company relies on).

Another critical part of our context: we have flexible and fast source control, and a group of developers who have learned how to use it competently. We have decoupled source control from build automation, so that we can perform builds of branches in progress without merging them to a mainline. We can achieve a sufficient degree of continuous integration without actually putting each change directly into the same code line.

Review Code Early

How long after code has been written should it be reviewed? Hours or days, not weeks. For this reason, we commit and push code frequently (to a disposable branch of course) where others can see it for early review. It can be a challenge to get past the ego-driven desire to make code perfect before exposing it to the scrutiny of others, but it is worthwhile. By exposing code to review and comment early in its life, a developer can get useful feedback early in a unit of work.

Review Code Often

Another anti-pattern of code review is to look at a given proposed change only once or twice. That is sensible for a very small change, but for a complex change to a complex system it is best to perform code review repeatedly as work proceeds. For example, consider a feature that takes four weeks to implement. Our typical approach, which I recommend to others (if their context is reasonably similar to ours, see above) is something like this:

  • Review the code a few days in.
  • Review the code progress once a week or so.
  • Review the code when it is nearly ready to be called “done” and to go in the mainline.
  • If there are fixes needed, review those promptly so as not to cause delay.

Review Code Now

Code waiting for review is a drag on the speed of development; the right time to review code is now. A quick review now is probably more valuable than a deep review deferred.

Review Code as a Gate

In some organizations, code review happens asynchronously and sporadically, after code is already part of a project. This risks software quality collapse. Once code is in the mainline of a product, it is arguably too late to review. If a reviewer finds a problem with code already in a system, there will be pressure to sweep that problem under the rug and move on, or to convince yourself that it is not really a problem, or that quality doesn’t really matter, just this once.

The right answer is simple: use code review as a gate through which code must pass before it can become part of the mainline of the project. Use your source control system, and stack up proposed code changes that are nearly ready to go into the mainline. Then periodically perform your review process (whether it is in person, remote, synchronous, asynchronous, etc.). Once the code passes a final review process, then it goes in the mainline. If there are serious problems, the code sits in a branch, a developer improves it, and it tries again next time.
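As a sketch of the mechanics (branch names, file names, and messages here are hypothetical), the gate looks something like this in git:

```shell
# Sketch of a review-gate workflow in git; all names are hypothetical.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q -b main
git config user.email dev@example.com
git config user.name Dev
echo 'v1' > app.txt
git add app.txt && git commit -qm 'mainline'

# Work happens on a disposable branch, pushed early and often for review
git checkout -qb review/feature-x
echo 'v2' > app.txt
git commit -qam 'feature work, under review'

# Only after the final review passes does the branch reach the mainline
git checkout -q main
git merge -q --no-ff review/feature-x -m 'merge after review gate'
```

The --no-ff merge is one way to make the gate visible in history: each merge commit marks a change that passed review before touching the mainline.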

We have had excellent results with this approach, maintaining software quality for years on end even through ongoing development team turnover.

 

Update Your Obsolete Packages

A Great Solution…

Maven, Leiningen, Nuget, Gradle, NPM, and numerous other package/dependency management tools are very helpful for modern (or perhaps post-modern) development, which typically involves numerous library dependencies.

These tools implement a fundamentally good and important idea:

  1. List the packages, and versions, your package/application depends on. In a text file. In the project. Where it can be diffed and merged.
  2. Run a command, and all the libraries are fetched and made available.
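For example, NPM expresses this idea as a package.json file in the project (the package names and versions below are hypothetical):

```json
{
  "name": "example-app",
  "dependencies": {
    "some-web-framework": "4.2.1",
    "some-utility-library": "~1.0.3"
  }
}
```

Because it is plain text in the project, it diffs and merges like any other source file.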

All of the above tools default to fetching from open source software repositories. Some or all of them can be easily configured to perform the same job with internal, closed-source repositories if needed.

All of the above tools are a large improvement over the bad old days, when adding a library meant a manual, recursive search of the internet for transitive dependencies.

… Leads to a New Problem

These tools make it so easy to “lock in” specific library versions, that projects can very easily fall far behind the current release versions of those libraries. To avoid this in our projects, a few times per year we upgrade all the libraries (timed to avoid doing it right before any important release dates).

I’ve seen this done by hand, looking up the current version of each library – and it is very tedious. Instead, a package/dependency manager ought to have an easy way to update versions. Sadly, as far as I know none of them have such a thing built in. Here are the addon tools I’ve found so far:

NPM

Use npm-check-updates. The built-in “npm outdated” sounds like it might do the right thing, but it only lists outdated packages; it doesn’t update anything.

npm-check-updates -u

Leiningen

Use lein-ancient.

lein ancient upgrade

Maven

The Versions Plugin does the job.

mvn versions:use-latest-releases

Ruby

gem outdated

or

bundle outdated

Bower

There are numerous Stack Overflow questions asking for this functionality, but it is not present. To some extent, “bower list” will show you packages for which newer versions are available, then you can manually update them in your bower.json file.

Cocoapods

pod outdated

Others?

If anyone knows of similar tools for other dependency managers, I’ll be happy to add them to this list.