Webinar Followup: Overview of Azure Data Factory V2
About 3 weeks ago I presented a free webinar for Pragmatic Works where I talked about Azure Data Factory V2. You can watch the recording here if you missed it. In this follow-up I answer all the important questions that were posted to the Q&A pannel, like those related to the return of fun facts. If you have been tuning in for Pragmatic Works webinars for a long time you will remember fun facts from way back in the day.
Here is a link to the recording of the session (free to view): http://pragmaticworks.com/Training/Details/An-Overview-of-Azure-Data-Factory-V2
Q: Can you use 3rd party tasks in Data Factory V2?
A: Not in native data factory, but that functionality is being added to the SSIS Integration Runtime. Which rolls into another question very nicely.
Q: Can we use 3rd party tasks in SSIS packages executed in ADF V2?
A: At the time of the webinar the answer was “no”. However, a few days after the webinar there were some developments on that topic. Check out this set of documentation for adding 3rd party components and drivers (i.e. Oracle and Teradata) to your SSIS runtime virtual machines. It’s not pretty right now how you do it, but we are in preview still and the experience is likely to change over time.
Q: Can we access the SSIS catalog stored procedures to execute packages from T-SQL code?
A: Yes. This documentation all references setup of the pipeline with PowerShell, but the section describing that you have to use a stored procedure activity is still valid as is the sample code embedded in the JSON.
Q: I’m waiting for the fun facts.
A: Me too, Chris. Me too. Maybe Liz can do something about that and bring back the fun facts. I bet it would boost attendance by 30% just by doing that.
Q: Why would I use anything other than SSIS in Azure?
A: Data Factory is a managed service and with that come numerous advantages. There are also a lot of integration points in ADF that aren’t there in SSIS because it is built-in the cloud and built to connect natively to all the other resources in Azure. If all you care about is moving data from SQL Server to SQL Server then sure, you can do that with SSIS but you can also do it with Data Factory and not have to worry about Visual Studio or maintaining versions of the SQL Engine and upgrading packages and all the other headaches that come from using non-service solutions. That being said, there are things that SSIS does better than ADF right now just like there are things ADF does better than SSIS. Just choose the right tool for the right job.
Q: Can an SSIS package reference on-prem data?
A: Yes. You will want to put the SSIS Integration Runtime virtual machines onto a VNet that has ExpressRoute or VPN connectivity back to your on-prem network to do that. Some details can be found here.
Q: When is Pragmatic Works going to release Task Factory for Data Factory?
A: A couple quick notes here. Pragmatic Works Software is now part of SentryOne. This happened a week after the webinar. For clarity, this is only the software portion of the company. Pragmatic Works still has training and consulting services; nothing changes with those services. That being said here are some details on using Task Factory in ADF. Head over to the SentryOne site to keep up with all the latest on Task Factory going forward.
Q: Dang, you are making me feel old.
A: I’m sorry. I’ll be old soon too, so don’t feel bad. I hear you get to eat lots of pudding without being judged, is that true?
Q: Can we load files that have dynamic file names?
A: Yes, many places in ADF can be parametrized. There is a button under the different options that says “Dynamic” and clicking that will allow parameterization for dynamic file names.
Q: How is ADF’s copy different from things like ADL Copy and AZ Copy?
A: ADF will allow you to copy data from files and databases to files and databases. ADL Copy and AZ Copy are just for file movement (for example, CSV local to CSV in blob storage).
Q: WOW, V2 looks great!
A: It sure does! It is what everyone thought V1 was going to be. I highly recommend trying it out. It won’t be for everyone, but if you are heavily invested in the cloud or looking to invest heavily in the cloud it is definitely worth learning. It is truly the cloud version of SSIS.
Q: Do you have to use the SSIS Catalog to use the SSIS Integration Runtime?
A: Yes, that is still used for storage of the SSIS packages as it is in SQL Server today. Good news is that the database will end up in Azure SQL DB or SQL Managed Instance, so you don’t have to worry about doing DBA tasks on the database like backups; that’s all handled for you.