SQL Is The Lingua Franca Of Data Practitioners. How Can We Make It Better Understood?

The world has seen the rise and fall of lingua francas, both across geographic regions and among practitioners in professional fields. English is still the dominant language in most Commonwealth member countries and in academic circles, much as Latin was in the Roman Empire and Aramaic was the common language of Western Asia. The same applies to professions that share a common vocabulary: classical musicians still use Italian words such as “crescendo” (growing) and “dolce” (sweet) to indicate how a piece should be interpreted. For data practitioners (e.g. data analysts, BI professionals, data scientists), the lingua franca is SQL.

Developed in the 1970s, Structured Query Language (SQL) is still widely used by data practitioners, thanks to the popularity of relational databases and despite the existence of non-relational (NoSQL) data stores. Data in tabular form is relatively easy to organize, manage and retrieve, and SQL is the language used to do all of that. Usage varies among practitioners: DBAs and web developers use it to build and manage databases, a reporting analyst might build and automate reports in SQL for business users, and a data scientist might use it to extract data and build statistical models around it.
The field of web analytics opened up many more opportunities for SQL-proficient people beyond the traditional BI tools offered by Microsoft, Oracle, and SAP. Nowadays it’s not uncommon to hear about Google BigQuery, New Relic (NRQL), and even self-service BI tools such as Looker (LookML), each of which developed its own SQL-like language for querying data in the cloud.

While the basic syntax of SQL is well understood and can be learned in many courses, online and in class, it’s important to understand that this syntax is equivalent to the basic grammar that every spoken and programming language has. Much like the Iceberg Model, which helps explain how little people really know about a foreign culture when they first try to envision it, once you start working in the field you realize that many SQL “dialects” exist because of differences in business structure and culture. I’m not talking about the differences in syntax from one tool to another, but about organization-specific dialects. The factor that determines whether an analyst’s or developer’s job looks more like heaven or hell is the documentation of the business rules behind the reporting operations. At the end of the day, reporting or data extraction is done in order to achieve a business goal. If your organization does not document its business rules properly (which tables should be joined and when, what certain codes in certain columns of table X mean, and so on), it will be very hard for new analysts and developers to catch up and make progress, especially those who work on large databases or in organizations that have no data warehouse.
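To make the idea of documented business rules concrete, here is a minimal sketch of a query that carries its rules with it. Everything in it is hypothetical and only for illustration: the `customers` and `transactions` tables, the status codes ('C', 'R', 'X'), and the rule that a "returning customer" has at least two completed transactions are all assumptions, not rules from any real organization. It uses Python's built-in `sqlite3` module so it can be run as is.

```python
import sqlite3

# Hypothetical business rules, written down next to the query that relies on them:
#   - transactions.status codes: 'C' = completed, 'R' = refunded, 'X' = cancelled
#   - only completed transactions count toward customer activity
#   - a "returning customer" has two or more completed transactions
RETURNING_CUSTOMERS_SQL = """
SELECT c.customer_id,
       MIN(t.created_at) AS first_transaction,
       COUNT(*)          AS completed_transactions
FROM customers AS c
JOIN transactions AS t
  ON t.customer_id = c.customer_id   -- documented join key
WHERE t.status = 'C'                 -- 'C' = completed (assumed code)
GROUP BY c.customer_id
HAVING COUNT(*) >= 2                 -- returning = 2+ completed transactions
ORDER BY c.customer_id
"""


def find_returning_customers(conn):
    """Return (customer_id, first_transaction, completed_transactions) rows."""
    return conn.execute(RETURNING_CUSTOMERS_SQL).fetchall()


def build_demo_db():
    """Create a small in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            customer_id  INTEGER PRIMARY KEY,
            email_opt_in INTEGER
        );
        CREATE TABLE transactions (
            transaction_id INTEGER PRIMARY KEY,
            customer_id    INTEGER REFERENCES customers(customer_id),
            status         TEXT,
            created_at     TEXT
        );
        INSERT INTO customers VALUES (1, 1), (2, 0), (3, 1);
        INSERT INTO transactions VALUES
            (10, 1, 'C', '2024-01-05'),
            (11, 1, 'C', '2024-02-10'),
            (12, 2, 'C', '2024-01-20'),
            (13, 2, 'X', '2024-03-01'),
            (14, 3, 'R', '2024-01-15');
    """)
    return conn


if __name__ == "__main__":
    for row in find_returning_customers(build_demo_db()):
        print(row)
```

A new analyst reading this query does not have to guess what 'C' means or which key joins the two tables. Whether the rules live in SQL comments, a wiki, or a data dictionary matters less than the fact that they are written down somewhere.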

In conclusion, I’d say that the ball is in your court. If you want your data professionals to become the organization’s data experts, you have to document as many business rules as possible, including the logic used to identify certain actions (e.g. email opt-in, first transaction, returning customer). Even if the experts work for your organization now, if nothing is documented their knowledge will disappear once they decide to move on to another position. With proper documentation, it will be easier to train and onboard people quickly when you need to, and in today’s world, where data is the new oil, you might need to hire more data professionals sooner or later. One thing is for sure: the written word lasts, even when the experts move on.