What does it mean to "program to an interface"?

One day, a junior programmer was instructed by his boss to write an application to analyze business data and condense it all in pretty reports with metrics, graphs and all that stuff. The boss gave him an XML file with the remark "here's some example business data".

The programmer started coding. A few weeks later he felt that the metrics and graphs and stuff were pretty enough to satisfy the boss, and he presented his work. "That's great," said the boss, "but can it also show business data from this SQL database we have?"

The programmer went back to coding. There was code for reading business data from XML sprinkled throughout his application. He rewrote all those snippets, wrapping them with an "if" condition:

if (dataType == "XML")
{
   ... read a piece of XML data ...
}
else
{
   ... query something from the SQL database ...
}

When presented with the new iteration of the software, the boss replied: "That's great, but can it also report on business data from this web service?" Remembering all those tedious if statements he would have to rewrite AGAIN, the programmer became enraged. "First XML, then SQL, now web services! What is the REAL source of business data?"

The boss replied: "Anything that can provide it."

At that moment, the programmer was enlightened.
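For what it's worth, here is a minimal sketch of where that enlightenment might lead. Everything in it is made up for illustration: the BusinessDataSource interface, the fetchRecords method, the two stand-in implementations and the record strings are assumptions, not anything from the story. The point is only that the reporting code talks to "anything that can provide it" and no longer needs the if/else.

```java
import java.util.Arrays;
import java.util.List;

public class ReportDemo {
    // Hypothetical abstraction: anything that can provide business data.
    interface BusinessDataSource {
        List<String> fetchRecords();
    }

    // Stand-in implementation; a real one would parse the XML file.
    static class XmlDataSource implements BusinessDataSource {
        @Override
        public List<String> fetchRecords() {
            return Arrays.asList("xml-record-1", "xml-record-2");
        }
    }

    // Stand-in implementation; a real one would query the SQL database.
    static class SqlDataSource implements BusinessDataSource {
        @Override
        public List<String> fetchRecords() {
            return Arrays.asList("sql-record-1");
        }
    }

    // The reporting code depends only on the interface,
    // so it contains no branching on the kind of data source.
    static String buildReport(BusinessDataSource source) {
        return "Report: " + String.join(", ", source.fetchRecords());
    }

    public static void main(String[] args) {
        System.out.println(buildReport(new XmlDataSource()));
        System.out.println(buildReport(new SqlDataSource()));
    }
}
```

Supporting the web service would then presumably mean adding one more class that implements the interface, with buildReport left untouched.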


You are probably looking for something like this:

public static void main(String... args) {
  // do this - declare the variable as type Set, which is an interface
  Set<String> buddies = new HashSet<>();

  // don't do this - the variable is tied to one concrete implementation
  HashSet<String> buddies2 = new HashSet<>();
}

Why is it considered good to do it the first way? Let's say later on you decide you need a different data structure, say a LinkedHashSet, to take advantage of its predictable iteration order. The code has to be changed like so:

public static void main(String... args) {
  // do this - declare the variable as type Set, which is an interface
  Set<String> buddies = new LinkedHashSet<>();  // <- change only the constructor call

  // don't do this - the variable is tied to one concrete implementation,
  // so here you have to change both the variable type and the constructor call
  // HashSet<String> buddies2 = new HashSet<>();  // old version
  LinkedHashSet<String> buddies2 = new LinkedHashSet<>();
}

This doesn't seem so bad, right? But what if you wrote getters the same way?

public HashSet<String> getBuddies() {
  return buddies;
}

This would have to be changed, too!

public LinkedHashSet<String> getBuddies() {
  return buddies;
}

Hopefully you can see that, even in a small program like this, the type you declare for a variable has far-reaching implications. With objects being passed back and forth so much, the program is easier to write and maintain if you declare variables, parameters and return types as the interface rather than as a specific implementation of it (in this case, declare a Set, not a LinkedHashSet or whatever). Then the getter never has to change:

public Set<String> getBuddies() {
  return buddies;
}
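To put the pieces together, here is a small self-contained sketch. The BuddyList class, the addBuddy method and the names "alice" and "bob" are just my assumptions for illustration. Only the field initializer mentions a concrete class; the getter and the calling code know nothing but Set.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class BuddyList {
    // The only line that names a concrete class. Swap in HashSet or
    // TreeSet here and nothing else in this file has to change.
    private final Set<String> buddies = new LinkedHashSet<>();

    public void addBuddy(String name) {
        buddies.add(name);
    }

    // The return type is the interface, so callers never learn
    // which Set implementation is behind it.
    public Set<String> getBuddies() {
        return buddies;
    }

    public static void main(String[] args) {
        BuddyList list = new BuddyList();
        list.addBuddy("alice");
        list.addBuddy("bob");
        list.addBuddy("alice"); // duplicate, ignored by any Set

        Set<String> result = list.getBuddies();
        System.out.println(result.size());          // prints 2
        System.out.println(result.contains("bob")); // prints true
    }
}
```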

There's another benefit too: at least for me, thinking in terms of the interface first helps me design a program better, because it makes me decide what behaviour I need before I pick a class that provides it. Hopefully my examples give you some idea. Hope it helps.


My initial read of that statement is very different from any answer I've read yet. I agree with all the people who say that using interface types for your method params, etc. is very important, but that's not what this statement means to me.

My take is that it's telling you to write code that depends only on what the documentation of the interface you're using says it does (here I'm using "interface" to mean the exposed methods of either a class or an interface type). This is the opposite of writing code that depends on the implementation details of the functions you're calling. You should treat all function calls as black boxes (you can make exceptions when both functions are methods of the same class, but ideally the rule holds at all times).

Example: suppose there is a Screen class that has Draw(image) and Clear() methods on it. The documentation says something like "the draw method draws the specified image on the screen" and "the clear method clears the screen". If you wanted to display images sequentially, the correct way to do so would be to repeatedly call Clear() followed by Draw(). That would be coding to the interface. If you're coding to the implementation, you might do something like only calling the Draw() method because you know from looking at the implementation of Draw() that it internally calls Clear() before doing any drawing. This is bad because you're now dependent on implementation details that you can't know from looking at the exposed interface.
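Here is a rough sketch of that Screen example, under a few assumptions of mine: the method names are lowercased to follow Java convention, images are plain strings, and the contents() helper plus the two implementations (ClearingScreen and LayeringScreen) are invented so the difference is observable. Both implementations satisfy the documented contract, but only one of them happens to clear inside draw().

```java
public class ScreenDemo {
    // Hypothetical interface that mirrors only the documented contract.
    interface Screen {
        void draw(String image); // "draws the specified image on the screen"
        void clear();            // "clears the screen"
        String contents();       // helper for this sketch, to inspect the result
    }

    // This implementation happens to clear inside draw().
    static class ClearingScreen implements Screen {
        private final StringBuilder pixels = new StringBuilder();
        public void draw(String image) { clear(); pixels.append(image); }
        public void clear() { pixels.setLength(0); }
        public String contents() { return pixels.toString(); }
    }

    // This one does not, which the documentation also allows.
    static class LayeringScreen implements Screen {
        private final StringBuilder pixels = new StringBuilder();
        public void draw(String image) { pixels.append(image); }
        public void clear() { pixels.setLength(0); }
        public String contents() { return pixels.toString(); }
    }

    // Coded to the interface: relies only on what the documentation promises.
    static String showSequentially(Screen screen, String... images) {
        for (String image : images) {
            screen.clear();
            screen.draw(image);
        }
        return screen.contents();
    }

    // Coded to the implementation: silently assumes draw() clears first.
    static String showAssumingDrawClears(Screen screen, String... images) {
        for (String image : images) {
            screen.draw(image);
        }
        return screen.contents();
    }

    public static void main(String[] args) {
        System.out.println(showSequentially(new ClearingScreen(), "A", "B"));       // prints B
        System.out.println(showSequentially(new LayeringScreen(), "A", "B"));       // prints B
        System.out.println(showAssumingDrawClears(new ClearingScreen(), "A", "B")); // prints B
        System.out.println(showAssumingDrawClears(new LayeringScreen(), "A", "B")); // prints AB
    }
}
```

The version that follows the documentation gives the same result on both screens, while the shortcut version only works as long as the implementation never changes.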

I look forward to seeing if anyone else shares this interpretation of the phrase in the OP's question, or if I'm entirely off base...