We have all seen the job postings. “Immediate hire… ETL developer with 10 years’ experience.” Regardless of fast-changing technologies, there will be a need for data integration for the foreseeable future. Even the largest of big data repositories can’t automatically integrate data; not yet. Having all of your data in a repository that is not integrated may help with technical challenges, but it does little to deliver the highly sought-after single version of the truth. Perhaps one day artificial intelligence will be able to accurately integrate data, but for now ETL developers are in high demand.
ETL developer demand is based on organizations’ need to operate efficiently and ultimately gain a competitive advantage. When choosing how to solve data integration needs, cost, quality, and time will surely be considered. For now, I’ll focus on cost. What does an ETL developer cost? Of course the pay range varies, but to get a ballpark cost I took a survey at Payscale.com, providing inputs to simulate a mid-career SQL Server developer with a bachelor’s degree and a couple of certificates. The results seem pretty believable.
So, let’s assume our fictitious ETL developer costs an organization a total of $100,000 per year after including all benefits and related costs. Crunch a couple of numbers, including standard vacation and a 40-hour work week (roughly 2,000 working hours per year), and that boils down to $50 per hour. I’ll compare this $50 per hour cost to other ways of solving a company’s data integration needs.
Option 1: Full time hire = $50 per hour
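The arithmetic behind that $50 figure can be sketched in a few lines. This is only a rough illustration: the function name and the assumption of 50 working weeks per year (52 weeks less two weeks of vacation) are mine, chosen because they reproduce the number above.

```python
def hourly_cost(annual_cost, weeks_worked=50, hours_per_week=40):
    """Convert a fully loaded annual cost into a cost per working hour.

    weeks_worked=50 is an assumption (52 weeks minus 2 weeks of vacation).
    """
    working_hours = weeks_worked * hours_per_week  # 50 * 40 = 2,000 hours
    return annual_cost / working_hours


# The fictitious $100,000-per-year ETL developer from the text
print(hourly_cost(100_000))  # 50.0
```

Change the vacation or work-week assumptions and the hourly figure moves accordingly, which is worth remembering when comparing against quoted consulting rates.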
Another option is to outsource the work. This often makes sense financially. While there are many flavors of outsourcing, they usually fall into two buckets: onshore and offshore. First, onshore. What is the cost of an onshore consultant? That’s a tough one to boil down to a single number. I’m just going to go with experience and throw out a number. Consulting firms are going to charge upwards of $125 per hour. Independent consultants are going to be slightly cheaper on average, so I’ll estimate their rate at $80 per hour. Of course, some firms will charge more and some will charge less depending on many factors. For this little simulation I’ll estimate the onshore outsource cost of a mid-career ETL developer at $100 per hour.
Option 2: Onshore Outsource = $100 per hour
Offshore ETL development takes on many flavors. Some organizations have a small presence in the US to deal with business development as well as simplify implementation logistics. Others may operate 100% offshore. Still other organizations partner with large corporations and work in a near employee-like capacity. Again, estimating the client cost is difficult, but I think everyone would agree that offshore cost is assumed to be lower than hiring full-time employees. So, we are looking at something less than $50 per hour. I’ve personally seen rates as low as $15 per hour, but $30-$50 per hour is probably more representative. Your guess is as good as mine, but I will estimate the hourly rate for offshore ETL development at $35 per hour.
Option 3: Offshore Outsource = $35 per hour
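To put the three estimates side by side, here is a small sketch that expresses each option relative to the full-time hire. The rates are the rough estimates from this post, not market data, and the names used in the code are purely illustrative.

```python
# Hourly estimates taken from the three options above (ballpark figures only)
RATES = {
    "Full-time hire": 50,
    "Onshore outsource": 100,
    "Offshore outsource": 35,
}


def relative_to_baseline(rates, baseline="Full-time hire"):
    """Return each option's rate as a multiple of the baseline option's rate."""
    base = rates[baseline]
    return {name: rate / base for name, rate in rates.items()}


for name, ratio in relative_to_baseline(RATES).items():
    print(f"{name}: ${RATES[name]} per hour ({ratio:.1f}x the full-time rate)")
```

On these numbers, onshore outsourcing runs at about twice the full-time hourly cost and offshore at about 70% of it, before any difference in the value delivered is taken into account.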
As already mentioned, cost is certainly not the only consideration. Any organization that focuses on building expertise in a particular domain will offer far more value than is represented in the hourly rate. We all know you get what you pay for. A skilled consultant whom you pay $200 per hour is generally going to deliver more value than a less skilled consultant whom you pay $25 per hour. For organizations that do not have full-time needs for in-house ETL development, outsourcing is an obvious solution.
So, how did I get to the $7.25 per hour job ad? Data warehousing is a well-known domain. Tools like LeapFrogBI’s Data Warehouse Automation platform, Wherescape Red, TimeXtender, and others enable organizations to do more with less, and do it faster. Much of the work we do to integrate data is repetitive and has well-established best practices. This is the perfect scenario for a tool to add value. Some areas of data warehousing are very much customized to unique needs, such as designing a target data model suited to fulfill the requirements. Loading stage, dimension, and fact tables, on the other hand, is rarely a problem that has not already been solved many times over. The overwhelming majority of ETL development for data warehousing follows industry-standard best practices.
How much more productive can a tool make a developer? You can easily quadruple the productivity of an ETL developer by giving them the right tool and training (see graphic below). All of these tools work on a common principle. Instead of developing code, the ETL developer tells the tool what code should be developed. In data warehousing terms, a developer creates a dimension by selecting a data source, a natural key, and the like, which takes only a few minutes in most cases. If that makes sense, then you can understand how tools can make one ETL developer more productive than 10 without the tool. Another way to describe this productivity gain is that a $50 per hour employee’s output is 10 times greater, creating an effective rate of only $5 per hour. After adding on the value of being able to deliver rapidly as well as evolve with changing business requirements, it is hard to understand why anyone would manually develop ETL for data warehousing. $7.25 is the US federal minimum wage, so I had to round up.
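The effective-rate reasoning is simple division, and a short sketch makes it easy to try other productivity multipliers. The function and constant names below are illustrative; the 4x and 10x multipliers are the two figures mentioned above, not measured results.

```python
FULL_TIME_RATE = 50.0        # dollars per hour, from Option 1
FEDERAL_MINIMUM_WAGE = 7.25  # dollars per hour, as cited in the text


def effective_rate(hourly_rate, productivity_multiplier):
    """Cost per unit of 'manual developer hour' output when a tool multiplies productivity."""
    return hourly_rate / productivity_multiplier


for multiplier in (4, 10):
    rate = effective_rate(FULL_TIME_RATE, multiplier)
    below_minimum = rate < FEDERAL_MINIMUM_WAGE
    print(f"{multiplier}x productivity: ${rate:.2f} per hour "
          f"(below minimum wage: {below_minimum})")
```

At a 10x gain the effective rate is $5 per hour, which is where the rounded-up $7.25 headline comes from; at the more conservative 4x gain it works out to $12.50 per hour.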
What’s the moral of the story?
First, data integration is not going away anytime soon. Technology changes fast, but for now we still need humans to integrate data from disparate systems. Second, there are situations where direct hire, onshore outsourcing, and offshore outsourcing each make sense. Like most things, you will get what you pay for. Focusing on the value instead of the hourly rate is usually the best approach. Finally, give your resources the tools needed to be as productive as possible. Combine a skilled ETL developer with a powerful tool and you are surely going to be happy with the outcome.