
Azure Synapse Workspace: Where are the scripts published?


In an Azure Synapse workspace, there are two options (shown in red below) for publishing your content. Question: Where are the content and scripts published, and how can we access them after publishing?

[Image: screenshot of the Synapse workspace with the two publishing options highlighted in red]


Solution

  • I find this to be one of the more confusing topics in Synapse. It also applies to Azure Data Factory (ADF). The short answer to your question is that your content gets Published to the Live Synapse service. The longer version is below.

    Azure Synapse has two modes: Synapse Live and (optionally) Git connected.

    Live mode

    Live mode is the "production" version. It contains all the artifacts (scripts, notebooks, pipelines, and others) that your users can access (assuming proper security, etc.). It is also what surfaces artifacts that can be executed externally, like Pipelines. When you execute a pipeline externally (say from a Logic App), it is the Live version that executes. [Again, same in ADF.]
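
    To make "executed externally" concrete, here is a minimal sketch of the kind of request an outside caller would build to start a published pipeline. It assumes the Synapse data-plane "create pipeline run" REST endpoint with api-version 2020-12-01; the workspace name, pipeline name, and token are hypothetical placeholders, and acquiring the Azure AD token is left out. The request is only built, never sent.

```python
import json
import urllib.parse
import urllib.request


def build_create_run_request(workspace_name, pipeline_name, bearer_token, parameters=None):
    """Build (but do not send) a POST that starts a run of a published pipeline.

    Assumption: the endpoint shape and api-version follow the Synapse
    data-plane REST API; check them against the current reference.
    """
    endpoint = f"https://{workspace_name}.dev.azuresynapse.net"
    # Pipeline names may contain spaces, so percent-encode the path segment.
    path = f"/pipelines/{urllib.parse.quote(pipeline_name, safe='')}/createRun"
    url = f"{endpoint}{path}?api-version=2020-12-01"
    # The request body carries the pipeline parameters (empty object if none).
    body = json.dumps(parameters or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical workspace, pipeline, and token values.
request = build_create_run_request("my-workspace", "Nightly Load", "<token>")
print(request.get_method(), request.full_url)
```

    Note that nothing in the request names a branch: an external caller can only ever reach whatever was last Published.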

    Whether you work in the workspace directly (as your image implies) or in Git branches (more on this below), you can think of those as "development" versions. "Publish" promotes the artifacts from development to production.

    In Live mode, the ONLY way to save the artifacts is to Publish, so in a way you are working directly in Production: your saved version is ALWAYS the Published version. For any real work involving teams, this can be troublesome. It is highly recommended that you connect your Workspace to a Git repository.

    Git mode

    When your workspace is connected to Git, you work in a branch. By default, this will most likely be the "main" branch. The main branch is your trunk (Synapse calls it the collaboration branch), and you can only Publish from it. But you can work for a very long time in main without ever publishing, so it really becomes a true development environment.

    In Git mode, you Commit (save) your artifact changes to your Git branch. Later, when you are ready to move the artifacts to production, you Publish from main. Publishing updates the Live service and also writes to a separate branch in Git, named "workspace_publish" by default in Synapse ("adf_publish" is the ADF equivalent). This is a branch you should basically never touch or work in directly, because it holds the ARM templates that Publish generates rather than your editable artifacts. [It is a personal wish list item for me to be able to auto-publish whenever main gets updated.]
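
    The difference between Commit and Publish can be sketched with a toy model. This is not a Synapse API: the ToyWorkspace class, its methods, and the artifact name are all made up for illustration, and "collaboration branch" here just means the branch you publish from (main in the description above).

```python
class ToyWorkspace:
    """Toy model of a Git-connected workspace; not a Synapse API."""

    def __init__(self, collaboration_branch="main"):
        self.collaboration_branch = collaboration_branch
        self.branches = {collaboration_branch: {}}  # Git branches: name -> artifacts
        self.live = {}  # what users and external callers see

    def commit(self, branch, artifact, content):
        """Save an artifact to a Git branch. Live is not touched."""
        if branch not in self.branches:
            # A new branch starts as a copy of the collaboration branch.
            self.branches[branch] = dict(self.branches[self.collaboration_branch])
        self.branches[branch][artifact] = content

    def publish(self, branch):
        """Promote a branch to Live. Only the collaboration branch may publish."""
        if branch != self.collaboration_branch:
            raise PermissionError(f"can only publish from {self.collaboration_branch!r}")
        self.live = dict(self.branches[branch])


ws = ToyWorkspace()
ws.commit("main", "load_sales.sql", "v1")
live_before_publish = dict(ws.live)  # still empty: committing is not publishing

ws.publish("main")
ws.commit("main", "load_sales.sql", "v2")  # committed, but not yet published

print(live_before_publish)  # {}
print(ws.live)              # {'load_sales.sql': 'v1'}
```

    In this model, the "v2" commit sits in main until the next Publish, which is the sense in which main acts as a development environment.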

    Some Git advice: if you have a Team of people (meaning more than 1) working in the workspace, you should set up your Git repository to block direct commits to main. [In fact, even if it IS just you, I would do it this way regardless.] Individuals should always work from a different branch and use Pull Requests to merge code back into main. I can tell you from experience that multiple people working directly in main makes it possible to screw up your repo to the point it won't Publish, which is no fun to correct.

    Back to Live mode

    Even when you are Git connected, Live mode is still present. You can always switch back to it from the drop-down. When you do, it is like a protected mode: you can write and execute scripts and notebooks, but you can't save them to the Workspace. You can also have users who only operate in Live mode, so they are consumers but not creators. When in Live mode, you will not be able to see or interact with the Git repo or branches. When you are ready to edit again, you can use the drop-down to go back to Git mode.