March 5, 2017

Python is the language of choice for APIs

“Software is eating the world” is a common refrain in Silicon Valley. When you dig a little deeper, you find that APIs make that happen. API stands for application programming interface. APIs are how software talks to other software. APIs are the special sauce that enables you to push a button on your phone to quickly summon a ridesharing service to your precise location.
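
As a minimal sketch of what "software talking to software" can look like, the snippet below models a ride request as a JSON message passed to a handler function. The names `handle_ride_request` and `pickup`, and the fields in the response, are hypothetical and invented for illustration; a real ridesharing API would sit behind a network service and do far more.

```python
import json


def handle_ride_request(request_body: str) -> str:
    """Hypothetical API handler: takes a JSON request, returns a JSON response."""
    request = json.loads(request_body)
    pickup = request["pickup"]  # assumed shape: {"lat": float, "lon": float}
    response = {
        "status": "driver_dispatched",
        "pickup": {"lat": pickup["lat"], "lon": pickup["lon"]},
    }
    return json.dumps(response)


# The "phone app" side: build a request, call the API, read the reply.
request_body = json.dumps({"pickup": {"lat": 37.7749, "lon": -122.4194}})
reply = json.loads(handle_ride_request(request_body))
print(reply["status"])
```

Neither side needs to know how the other is written. They only have to agree on the shape of the messages.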


Why is Python the language of choice for APIs? One reason is that it is readable. Python is designed to be easily read. Not only are the built-in parts of the language human-readable, there are also strong cultural norms that Python code should be understandable at first glance. Thus, more time is spent on "what" the code is trying to do than on "how" the code is written.
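
To make the readability point concrete, here is a small sketch that filters and summarizes some trip records. The data and field names (`rider`, `miles`) are made up for illustration.

```python
# Hypothetical trip records; the field names are assumptions for illustration.
trips = [
    {"rider": "ana", "miles": 3.2},
    {"rider": "ben", "miles": 12.5},
    {"rider": "ana", "miles": 7.1},
]

# Keep only the trips longer than five miles.
long_trips = [trip for trip in trips if trip["miles"] > 5]

# Total distance and the distinct riders across those trips.
total_miles = sum(trip["miles"] for trip in long_trips)
riders = sorted({trip["rider"] for trip in long_trips})

print(riders, total_miles)
```

Each line reads close to the sentence you would use to describe it, which is the "what" over "how" idea in practice.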

Another reason is “Low bar & high ceiling.” It is easy to be productive quickly, but the language is not inherently limiting. You can write your first Python program after a couple of hours. Many people become proficient in Python within a week or two. Yet it would take a lifetime to fully understand The Zen of Python. Python is often used to write code quickly to test new ideas, but that same code can be scaled to run large systems.

In particular, Python is the dominant language for data-centric APIs. The two packages that have swept through the Data Science community are Apache Spark and TensorFlow. Spark is a Big Data processing package and TensorFlow is a Deep Learning package. Both packages have APIs in several languages, but Python is common to both. However, neither package is natively implemented in Python! Apache Spark is implemented in Scala; the core of TensorFlow is implemented in C++. Python is just a friendly user interface that manages the underlying complexity.
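
The same pattern shows up in Python's own standard library. In CPython, the `zlib` module is a thin wrapper around the zlib compression library, which is written in C. The sketch below is only an analogy for the idea of a friendly Python front end over non-Python machinery; it does not show how Spark or TensorFlow expose their APIs.

```python
import zlib

# Sample data chosen for illustration; repetitive text compresses well.
text = b"Python is a friendly interface. " * 100

compressed = zlib.compress(text)        # the heavy lifting happens in C
restored = zlib.decompress(compressed)  # round-trip back to the original bytes

print(len(text), len(compressed), restored == text)
```

Two function calls hide all of the compression machinery, and the caller never has to touch the C code.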

Python allows Data Scientists to focus on their work, not on the implementation details.