<![CDATA[blog.AliStoops.com]]>
https://blog.alistoops.com/

<![CDATA[My Hopes for Feature Announcements at FabCon ‘24]]>
https://blog.alistoops.com/my-hopes-for-the-microsoft-fabric/66f1b37634c9070001acb77eMon, 23 Sep 2024 19:04:12 GMT

What is FabCon 24?

Europe’s first Microsoft Fabric community conference, branded FabCon, kicks off on September 24th, following the first global Fabric conference and preceding the 2025 event planned for Las Vegas next March/April. As detailed on the conference website, a number of Microsoft’s Data & AI experts, including some of the Fabric product leadership, will be in attendance.

Alongside a number of interesting speaker sessions, there is a planned session focusing on the Microsoft Fabric roadmap. On some online community forums, there has been talk of announcements being “held back” over the last one to two months so that they can be made during the conference this week.

My hopes for announcements

I thought it would be helpful to shortlist the announcements I’d most like to see but, given the number of ongoing development activities and items covered in both the roadmap and Fabric Ideas, I’ve broken it down into my top three hopes tied to known roadmap developments, my top three unrelated to the roadmap, and three other, more “out there” hopes - starting with the roadmap-related ones.

  • Version Control Updates - improved item support, especially for Dataflow Gen2, and ideally the ability to see the code that sits behind Dataflow Gen2 items so it can be validated and version controlled
  • Data Exfiltration Protection - though it’s unlikely to be a killer feature for every customer, data exfiltration protection for notebooks would be important to more security-conscious users and isn’t covered by OneSecurity (planned) or Purview
  • T-SQL improvements (e.g. MERGE, CTEs) - I think the current T-SQL functionality meets most core needs, but specific gaps (e.g. MERGE) still cause friction. I also see ongoing questions around whether to move on-premises data into Azure SQL before feeding it into Fabric or to ingest into Fabric directly, and given where Azure SQL functionality is today, there’s a compelling argument to land there first (see the sketch after this list for the kind of statement that’s currently missing)
  • While I tried to stick to a top three, I have to add that incremental load support in dataflows is another roadmap item I would love to hear more about this week
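To make the MERGE gap above concrete, here is a minimal, hedged sketch of the kind of upsert I mean. T-SQL MERGE isn’t available in the Fabric Warehouse today, but the equivalent logic can already be expressed against Delta tables from a Fabric notebook via Spark SQL; all table and column names below are hypothetical.

```python
# Hedged workaround sketch: a MERGE-style upsert run as Spark SQL against Delta tables
# from a Fabric notebook, standing in for the T-SQL MERGE the Warehouse doesn't support yet.
# Table and column names (silver.customers, staging.customer_updates) are hypothetical.
merge_sql = """
MERGE INTO silver.customers AS tgt
USING staging.customer_updates AS src
    ON tgt.customer_id = src.customer_id
WHEN MATCHED THEN
    UPDATE SET email = src.email, updated_at = src.updated_at
WHEN NOT MATCHED THEN
    INSERT (customer_id, email, updated_at)
    VALUES (src.customer_id, src.email, src.updated_at)
"""

spark.sql(merge_sql)  # `spark` is the SparkSession pre-created in Fabric notebooks
```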

Unrelated to roadmap hopes

  • Workload Management (queues, prioritisation, adjustable burst behaviours) - there are lots of options for how this could be implemented, but ultimately I believe there would be a lot of value in being able to prioritise certain jobs or workspaces. It would also encourage the use of larger capacities, rather than splitting production and non-production workloads at the capacity level, which is currently the only way to facilitate workload segregation
  • Low-code UPSERT for data pipelines and dataflows - similar to the utility of the incremental load support above, being able to UPSERT via data pipelines would be a great quality-of-life improvement, especially considering it’s already possible in Dataverse dataflows
  • Parameters / environment variable support (for notebooks and between experiences) - I have seen a number of examples of metadata-driven pipelines already, and have made use of parameters for dataflows, but being able to do this within notebooks and across experiences (between data pipelines and dataflows), as well as for deployment pipelines, would be great (a sketch of the notebook parameter pattern that exists today follows this list)
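For context on the notebook side, a parameter-cell pattern does already exist when a notebook is called from a data pipeline; what I’m hoping for is something broader and shared across experiences. The sketch below is a minimal, assumed example of that existing pattern - the parameter names are hypothetical, and the first block represents a cell toggled as a parameter cell so a pipeline notebook activity can override its defaults.

```python
# --- Cell 1: marked as a "parameter cell" in the Fabric notebook ---
# Defaults used for interactive runs; a pipeline notebook activity can override these
# via its base parameters. Names (source_path, load_date) are hypothetical.
source_path = "Files/landing/sales"
load_date = "2024-09-01"

# --- Cell 2: use the parameters in the actual work ---
df = spark.read.parquet(source_path)                     # read raw files landed in the lakehouse
df_filtered = df.where(df["ingest_date"] == load_date)   # keep only the requested load date
df_filtered.write.mode("append").format("delta").saveAsTable("bronze_sales")
```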

Others

  • Preview Features going GA - Fabric is maturing rapidly and adding lots of new features regularly, but it would be great to see more information around things becoming generally available. First of all, it would be good to see some longer-standing features move from preview to GA. I’d also like to understand what the cadence or path from preview to GA looks like for upcoming features, and to get more visibility of when things actually happen and what the impact is. We have the roadmap for new features, but it’s more difficult to get a clear view of how this affects existing features or environments
  • Native mirroring for on-premises SQL Server - currently, mirroring on-premises data sources somewhat relies on setup via an Azure SQL instance. In theory this isn’t much of a blocker, but in practice, native mirroring for on-premises SQL Server would provide a much better user experience
  • Asset Tagging - a few months ago, Microsoft added folder support to help organise workspace assets, which does the job, but tags would be a more flexible complement, extending the current “certified” and “favourite” options in OneLake and hopefully opening up additional security and governance options (e.g. attaching RBAC to tags)
]]>
<![CDATA[Common Microsoft Fabric Cost Misconceptions]]>
https://blog.alistoops.com/microsoft-fabric/66df682c34c9070001acb751Mon, 09 Sep 2024 21:47:07 GMT

Fabric has come under its share of scrutiny since going generally available in November 2023, and much of it was, or still is, worth consideration. Specifically, concerns around the maturity of version control and CI/CD (still in preview), observed delays in SQL Analytics Endpoint synchronisation, and the pricing model being subscription or capacity-based rather than purely consumption-based are probably the points I see referenced most often.

Though these are all fair, and hopefully being addressed, it's also worth addressing some common misconceptions, in this case related to capacity features, cost, and performance:

  • The minimum cost for utilising PowerBI embedded is “high”: at launch, embedding was limited to F64 capacities, but that’s no longer the case and embedding is possible with any F SKU. I’ve labelled this misconception “high” cost because one of two figures tends to be quoted - either the F64 capacity cost or the embedded pricing. Ultimately, the actual minimum cost could be as low as a couple of hundred USD per month for a small F SKU, compared to the roughly 800 USD (EM1) or 8,000 USD (F64) per month often quoted
  • F64 is required for fully-featured Fabric: this is partially true in that all features are available on F64 and above capacities. Smaller capacities previously lacked features like trusted workspace access and managed private endpoints, but all F SKUs are now mostly at feature parity - the only feature addition at F64 and above is Copilot
  • All Fabric experiences cost the same for like-for-like operations: William Crayger shared the most vivid example of this that I can remember, describing a more than 95% reduction in consumption units when running the same ingestion process with a Spark notebook compared to a low-code pipeline. I haven’t seen quite such dramatic results in my experience, but I have observed, for specific activities, up to a 75% cost reduction. That is to say, not all experiences will result in the same consumption
  • Capacity performance is better for larger capacities: some small differences can be observed due to smoothing & bursting but, as outlined here by Reitse, even comparing the smallest F2 capacity to F64, performance is largely the same
  • Fabric is "more expensive" than expected, or than PowerBI at scale: as with everything, this depends on many factors but, in terms of perception, it's worth bearing in mind that the F64 capacity cost (when reserved) is equivalent to a PowerBI Premium capacity (around $5k p/m)
  • The F64 SKU cost, including free PowerBI access for report consumers, is best fit for 800 or more users: this comes from the fact that otherwise you would be paying for around 800 (x 10 USD) PowerBI licenses for report viewers. However, it doesn’t factor in capacity reservation savings (~40%) nor any licensing already covered through enterprise E5 licenses. In real terms, the driving factor for capacity selection needs to be predominantly data requirements but, considering purely the licensing costs for report viewers, the crossover point will vary - it could be higher (if E5 licenses are in play) or lower (more like 500 if the capacity is reserved); see the worked sketch after this list
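To make that crossover point concrete, here is a minimal back-of-the-envelope sketch using the approximate list prices referenced in this post; the exact figures are assumptions and will vary by region and agreement.

```python
# Rough breakeven between per-viewer PowerBI Pro licences and an F64 capacity
# (which includes free report viewing). Prices are approximate list prices and
# are assumptions for illustration only.
pro_licence_per_user = 10.0           # USD per user per month
f64_payg_per_month = 8409.60          # USD per month, pay-as-you-go
f64_reserved_per_month = f64_payg_per_month * 0.60   # assuming roughly a 40% reservation saving

breakeven_payg = f64_payg_per_month / pro_licence_per_user          # ~841 viewers
breakeven_reserved = f64_reserved_per_month / pro_licence_per_user  # ~505 viewers

print(f"Pay-as-you-go breakeven: ~{breakeven_payg:.0f} report viewers")
print(f"Reserved breakeven:      ~{breakeven_reserved:.0f} report viewers")
```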

Now, this isn’t to say that factors such as portability, version control, or workload management and prioritisation aren’t worthwhile considerations, but I think it’s good to see barriers to entry being removed for new users and a mostly consistent experience across all capacities.

]]>
<![CDATA[Microsoft Fabric Workspace Structure and Medallion Architecture Recommendations]]>
https://blog.alistoops.com/microsoft-fabric-workspace-structure-and-medallion-architecture/6696aa2934c9070001acb746Wed, 17 Jul 2024 13:37:28 GMT

Intro & Context

I’ll start by saying that I won’t be discussing in any great detail the merits of the medallion architecture, nor the debate between building a literal medallion and structuring against the requirements of your domain while using medallion as a conceptual talking point (though I think this is a good video talking through the latter). We will just assume you’re following the guidance Microsoft publishes describing the usual bronze, silver, and gold zones or layers.

I’ve seen a number of conversations online, namely on Fabric community blogs, Reddit, and Stack Overflow, that primarily focus on whether implementing the medallion architecture means we should have one workspace per domain (e.g. sales, HR data) covering all layers, or a workspace per layer (bronze, silver, gold) per domain. While these threads often end up with a single answer, I don’t think there is a single right answer, and I also think some nuance is missed in this conversation. So what I’m going to cover here is some guidance around the implications of your Fabric workspace structure, as well as recommendations for a starting point. It’s also worth noting that this focuses primarily on lakehouse implementations.

Key Design Implications

Before sharing any recommendations, I want to get the “it depends” out of the way early. The “right” answer will always depend on the context of the environment and domain(s) in which you’re operating.

This is just a starting point and a way to break down some key decision areas. It’s worth noting there are lots of good resources and examples of people sharing their experiences online, but the reason this blog exists is that these more often than not represent the most straightforward scenarios (single source systems or one or two user profiles), where following Microsoft’s demos and Learn materials is enough. As far as the implications of your workspace structure in Microsoft Fabric go, I would suggest the key areas include:

  • Administration & Governance (who’s responsible for what, user personas)
  • Security and Access Control (data sensitivity requirements, principle of least privilege)
  • Data Quality checks, consistency, and lineage
  • Capacity (SKU) features and requirements (F64-only features, isolating workloads)
  • Users and skillsets
  • Version control & elevation processes (naming conventions, keeping layers in sync)

Potential High Level Options

[Image: Option 1 (B)]
  1. 1 workspace per layer per domain. This is recommended by Microsoft (see “Deployment Model”) as it provides more control and governance. However, it does mean the number of workspaces can grow very quickly, and operational management (i.e. who owns the pipelines, data contracts, lineage from the source systems) needs to be carefully considered
  2. 1 landing zone or bronze workspace then 1 workspace per domain for silver and gold layers. This could slightly reduce the number of workspaces by simply landing raw data in one place centrally but still maintaining separation in governance / management
  3. One workspace per domain covering all layers. Though this is against the suggestion in Microsoft documentation, it is the most straightforward option, and for simple use cases where there are no specific constraints around governance and access, or where users will be operating across all layers, this could still be suitable
[Image: Option 2 (C) - not recommended]

There’s also an additional decision point in terms of the approach to implementing the above for managing your bronze or raw data layer:

  • A) Duplicate data - where multiple domains use the same data, simply duplicate the pipelines and data in each domain. I think this is mostly a case of ownership or stewardship and your preferred operating model given the cost of storage is relatively low
  • B) Land data in a single bronze layer only, then use shortcuts - though the aim would be to utilise shortcuts to pull in data where possible, here I am specifically talking about agreeing that where raw data is used in multiple domains, you could land it in one bronze layer (say domain 1) then shortcut access to that in subsequent domains (domain 2, 3, etc.) to avoid duplication
  • C) Use Cloud object storage for Bronze rather than Fabric Workspaces - this is an interesting one, and I think it only really applies if you plan to go with the core option 2 where you’re looking to have a centralised bronze layer. In this case you could do it in a Fabric workspace, or you could have a bronze layer cloud object store (ADLS gen2, s3, Google Cloud Storage, etc.). I think the only potential reason to consider this is to manage granular permissions for the data store outside of Fabric. In real terms, I would rule this out completely and instead consider (B)
[Image: Option 3]

With the above in mind, you can see how there are a number of approaches where the potential number of workspaces could be as large as the number of domains (d) multiplied by both the number of layers (3) and the number of environments (e), or as small as the number of domains multiplied by the number of environments - with a dev, test, and prod structure (e = 3), that’s anywhere between d x 3 and d x 9 workspaces. Note that the workspace counts shown in each image would each need to be multiplied by the number of environments (usually at least 3).

It’s also worth noting that the above list isn’t exhaustive. There are other options, such as a monolithic approach (one workspace for all layers) for some federated teams but segregated workspaces for centrally managed data, or keeping the medallion layers monolithic as in option 3 (think a platform or data workspace) with a separate workspace for reporting. This is all about targeting a starting point.

[Image: Options Overview (1A, 1B top; 2A, 2B middle; 3A bottom)]

Sample Scenario

You might start to see why straightforward examples - where an individual is setting up their Fabric workspace(s) for data used by a single domain or team, with limited risk of duplication and clear requirements around access controls - result in an obvious structure. However, what does this mean when we begin to scale across multiple domains in an organisation and a larger number of Fabric users?

For the purpose of considering any options, I’m going to make some assumptions for a sample scenario. We’re going to consider a fictional organisation that is rolling out Fabric across three domains initially: Sales, Marketing, and HR. Sales and Marketing will utilise at least one of the same source datasets, and there are no clear security or access control requirements between layers, but administration and governance must be segregated by domain. The organisation must keep separation between prod and non-prod workloads, and there will be dev, test, and prod environments.

In the sample scenario, there are a number of options, but we could recommend, for example:

  • For each domain, utilise a workspace per environment (dev/test/prod)
  • For each domain, utilise a single workspace consolidating medallion layers (bronze / raw to gold / ready for analytical use)

While this doesn’t seem unreasonable, I will admit that it’s not particularly realistic. In most cases, I would expect there to be a preference for better governance and control around access to raw (bronze) data. In that case, while much of the above holds, I would expect the last point to be fundamentally different. Rather than moving to the other end of the spectrum and creating individual workspaces for all layers, domains, and environments, a real-world example I worked through previously, reasonably similar to the description above, went with a single bronze-layer landing zone (option 2B).

Recommendations

  • There isn’t a single “right” answer here, so discussing the trade-offs will only get you so far. I would suggest picking your best or clearest use case(s), reviewing your high-level requirements, building towards a proposed or agreed approach, and then figuring out whether specific issues need to be addressed
  • In general, I would recommend using shortcuts to minimise data duplication both across source storage and Fabric and across Fabric workspaces (described in option B). I really think it’s the best way to operate
  • Start by testing the assumption that it makes sense to use one workspace covering all layers of the medallion (option 3). While I think this will only make sense in practice with some adjustment (e.g. splitting data and reporting), and Microsoft recommend a workspace per layer, this is the decision with the biggest effect on administration at scale
  • If you need to segregate medallion layers into individual workspaces, I would propose starting with option 1 (B)
  • Where possible, utilise different capacities for each environment, or at least for production, to make sure production workloads aren’t affected by dev or test jobs. Though not specifically a workspace recommendation, this has come up in every Fabric capacity-planning exercise I’ve been part of to date. In the sample scenario, that would mean two capacities - one for dev and test and one for prod - compromising between flexibility and management overhead, and in this scenario the number of workspaces would be three times the number of domains rather than nine times. This may change if more flexible workload management arrives in future Fabric updates, and smoothing will help those running a single capacity, but it’s not currently possible to isolate workloads without multiple capacities
  • There are also a couple of item level recommendations I would consider:
    • Semantic models should primarily be implemented in the Gold layer - these are really to facilitate end use of modelled data. Adding semantic models in all layers could just add to the complexity of your Fabric estate
    • It’s likely that the design pattern of utilising Lakehouses only or Warehouses only (only meaning from bronze to gold) will be common. However, it’s worth considering the different permutations against your needs. In my experience, a good starting point is using lakehouses for bronze and silver, and warehouses in gold (see here for some key differences)
  • If you use a warehouse in the gold layer, look to utilise tables where possible and avoid views. Views will disable Direct Lake (or, rather, cause fallback behaviour), resulting in poorer performance - a short notebook sketch of the shortcut-plus-gold-table pattern follows these recommendations
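As a minimal, hedged sketch of the pattern above - reading raw data through a shortcut and materialising gold as a Delta table rather than a view - here is roughly how it might look in a Fabric notebook attached to the target lakehouse. All table and column names are hypothetical, and “bronze_sales_orders” is assumed to be a shortcut pointing at a table in a central bronze workspace.

```python
from pyspark.sql import functions as F

# A shortcut surfaces in the attached lakehouse like any other table, so it can be read directly.
bronze_df = spark.read.table("bronze_sales_orders")

# Conform and aggregate on the way to the gold layer.
gold_df = (
    bronze_df
    .filter(F.col("order_status") == "COMPLETE")
    .groupBy("order_date", "region")
    .agg(F.sum("order_value").alias("total_order_value"))
)

# Materialise gold as a Delta table (not a view) so Direct Lake doesn't fall back to DirectQuery.
gold_df.write.mode("overwrite").format("delta").saveAsTable("gold_daily_sales")
```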

What Next?

I appreciate this has been focused on creating a starting point, so I just wanted to add some personal opinions on how I’ve seen this work effectively. First of all, I think the decision around environment setup has a bigger effect than the trade-off between a higher or lower number of workspaces. What I mean by that is that without environments separated by a workspace boundary, utilising deployment pipelines and version control is either difficult or impossible, so it’s crucial to have different workspaces per environment.

Next, I believe the key driver for creating workspaces per medallion layer is data governance and access control. For me, the most logical way to balance that with the administration overhead is to use option 3, the monolithic approach, and add an additional “reporting” workspace for each domain. That allows governance and access control between source data and report consumption without creating a massive number of workspaces to manage.

[Image: Option…4?]
]]>
<![CDATA[Microsoft Fabric Roundup - Build 2024]]>
https://blog.alistoops.com/microsoft-fabric-roundup-build-2024/665724103f0d860001248b1aWed, 29 May 2024 13:06:08 GMT

Summary

The intent from Microsoft during Build 2024 was quite clear where Fabric was concerned: show the significant and ongoing progress since General Availability in late 2023, land the message around Fabric’s ease of access, and focus on the SaaS nature of the product by demonstrating a “single pane of glass” view of all your data and analytical workloads. To that end, I think a lot was covered, and there is plenty of potential and opportunity, but there is no doubt that the devil is in the details, which can only be properly understood through hands-on development.

Intro

Microsoft Build 2024 ran from May 21-23 in Seattle and online, and there was a significant focus on all things data and AI, as I mentioned in last week’s blog about the keynote session. Since all the sessions I was tracking have been made available on demand, and there were so many Fabric updates and announcements, I wanted to share the relevant talks and highlight a few key updates, along with some opinions.

Here are the various Fabric-specific sessions from MS Build that inform the rest of this blog post:

Market & Roadmap Update

Microsoft shared that they now have 11,000 customers using Fabric and over 350,000 customers using PowerBI, including 97% of Fortune 500 organisations. Though seeing an update on the growth of the customer base is interesting, the thing I would call out here is that Microsoft is presumably targeting at least the same audience for Fabric as currently uses PowerBI. In fact, they made a direct comparison between the two products’ customer growth over the equivalent period since launch.

An update to the Fabric roadmap was announced. There’s far too much to dissect here, but it’s worth having a browse as there were a few interesting items such as data exfiltration protection.

Key Announcements

I’ll note that this is heavily subjective, but I’ve highlighted what I see as the “bigger” announcements, based either on potential impact or on Microsoft’s messaging:

  1. Snowflake Interoperability and Apache Iceberg compatibility. Though much of the focus to date has been on delta tables in Fabric, Microsoft announced Iceberg compatibility during Build which is great in terms of extending to an additional open table format, and increasing flexibility, but the main thing to call out here is interoperability with Snowflake. Microsoft talked about and demonstrated the seamless two-way integration with Snowflake. This seems awesome, and alongside other announcements such as mirroring, will massively reduce the overhead in managing multiple data sources within Fabric. However, a word of warning is to consider the implications in terms of cost depending on storage and compute as this would likely be split across two products / services
  2. Fabric real-time intelligence. Aside from the nuts and bolts of this announcement, including integration with sources like Kinesis and Fabric Workspace Events in real time, it’s worth calling out that real-time event data will be a Fabric use case that is of interest to a wide range of users and something that would potentially have ruled Fabric out as a primary analytics platform before now. Microsoft had a representative from McLaren racing on stage to discuss how they were using real-time intelligence to ingest sensor data that, as an F1 fan, was really interesting. I’m sure I will have a lot of questions as I investigate this in more detail, but an obvious one is how cost compares to batch workloads and performance comparison to other options in this space
  3. AI Skills (see around halfway down this page). Released in preview, this is essentially a custom Copilot-type capability for business users that utilises generative AI to convert natural language to SQL, enabling rapid Q&A over Fabric data sources and straightforward prompt engineering. I think an important consideration here is where this sits in your overall user experience or use case, in the sense that more regularly asked questions would be surfaced through PowerBI reports. Perhaps this has a similar place to Athena in AWS for ad-hoc queries, where more regular questions are answered via Quicksight. Nonetheless, this seems interesting in terms of time to value and Fabric GenAI integration
  4. Fabric Workload Development for ISVs. This was one I didn’t see coming at all. Microsoft announced the availability (in preview) of the workload development SDK and shared a demo that walked through how developers can create offerings via the Azure Marketplace and allow Fabric customers to utilise their solutions without leaving their Fabric environment. Some workloads mentioned include Informatica, Neo4j, SAS, Quantexa, Profisee MDM, and Teradata. One really interesting thing here is that Microsoft have essentially opened up their React front-end so that workloads can have a “Fabric look and feel” for customers. I’m looking forward to getting hands-on with the SDK, and seeing how customers can accurately estimate pricing with ISV workloads.
[Image: Fabric Workload Development Architecture. Source: Microsoft]

Other Announcements 

It’s difficult to have complete coverage of all the news shared, but there was so much covered through these sessions that I’ve listed below a number of additional things that piqued my interest:

  • External data sharing (cross tenant)
  • Shortcuts to on-premises sources for OneLake (search here for “lake-centric”) available in public preview (this relies on data being in an open format, so mirroring of external databases was also brought in) - mirroring is free for storage of replicas and minimises duplication, but we need to be careful around managing cost as it would, presumably, mean egress and compute costs from the source provider
  • New Real-Time Intelligence module on MS Learn
  • Azure Databricks and Fabric integration - soon you will be able to access Azure Databricks Unity Catalog tables directly in Fabric
  • Anomaly detection preview 
  • Fabric User Data Functions & functions hub - think Azure Functions, but tightly integrated with Fabric. One example discussed was adding forecast projections to warehouse data, which is a common use case in PowerBI that would otherwise require some ETL-specific coding
  • VSCode Microsoft Fabric Extension enabling workspace integration, local debugging, and user data functions
  • Copilot for Data Warehouse - noting this produces the query but doesn’t run it so allows human in the loop adjustment and reduces unnecessary CU consumption. This also includes IDE-style (like GitHub copilot) autocomplete
  • Fabric API for GraphQL
[Image: Source: Microsoft]

Of course these will only be relevant for those reading shortly after publishing, but I wanted to share links and MS Forms in case people are looking for sign up links to public previews or a Microsoft Fabric trial after the event. 

]]>
<![CDATA[Microsoft Build 2024 - Keynote Key Takeaways]]>
https://blog.alistoops.com/microsoft-build-2024-keynote-key-takeaways/664f93723f0d860001248b01Thu, 23 May 2024 19:25:51 GMT

Microsoft Build 2024 is running from May 21-23 in Seattle and Online. Over the weekend, I will look to distil something more detailed around the Fabric and/or AI sessions from Build, but following a Keynote session that lasted over 2 hours, I thought I’d share my top 5 takeaways below. For those interested, the full Day 1 is available here, and Microsoft also posted a 90 second recap.

Key Takeaways

  • Copilots everywhere! Unsurprisingly, Copilot was front and centre for a significant portion of the keynote and there’s a lot to unpack. Outside the announcements, Satya also equated the Copilot runtime to a fundamental shift similar to what Win32 did for GUIs, shared that nearly 60% of Fortune 500 companies are using Copilot, and called out around half a dozen organisations with over 10k seats. I’ve included links to the announcements below, but I’m particularly interested to use Team Copilot
  • GPT-4o is Generally Available in Azure OpenAI (demo). LLMs and SLMs (namely GPT-4o and Phi-3) were peppered throughout the keynote, including a number of product launches and case studies (link). Recurring themes seemed to be optimisation and efficiency in either cost or performance, and multi-modality across image, text, and speech. It’s also worth calling out a moment where OpenAI CEO Sam Altman made a point of referencing AI as an enabler. I’ll take any opportunity to reiterate the importance of focusing on providing value by solving business problems (customer / user first), not using specific technology “just because”
  • Khan Academy announced a partnership with Microsoft focused on utilising AI to support educators, and Sal Khan highlighted that teaching will be an area that will see real change through the use of technology, something that I was interested to see during the keynote having recently been involved in a STEM Learning roundtable with many industry and education leaders on exactly this topic. A big part of the Khan Academy presentation was around the intention to make Khanmigo available to all teachers in the US for free 
  • Fabric Real-Time Intelligence is in preview - I’m excited to see more detail on this, but integrating with sources like Kinesis, Blob Storage Events, Kafka, CDC Events from Cosmos DB, and Fabric Workspace Events in real time will be critical to a number of prospective and existing Fabric customers, and opens up a number of new use cases
  • Continued investment in AI infrastructure through AMD MI300X Instinct accelerators and Cobalt VMs

I would also add that it was fantastic to see Kevin Scott (Microsoft CTO) on stage - he was a wonderful addition, and his personal anecdotes around technology and AI having the power to enable real change in medicine and education were poignant reminders of why I love working in this space.

As someone who’s played video games my whole life, there was one added bonus - Copilot might finally help me understand the world of Minecraft!

]]>
<![CDATA[MS Fabric Copilot - Recommendations and Pricing Considerations]]>
https://blog.alistoops.com/ms-fabric-copilot-recommendations-and-pricing-considerations/663666c43f0d860001248af1Sat, 04 May 2024 16:50:18 GMT

TLDR

Ultimately, Fabric Copilot looks to be a really simple way to integrate OpenAI services into a developer’s workflow, with a single method of billing and usage monitoring through your Fabric SKU. There are some assumptions that will need to be made and tested when it comes to accurately baselining cost, and certainly some thought needed around the highest-value use cases, but the cost model is appealing and I consider Fabric Copilot worth using, or at least trialling and assessing against your use cases, with appropriate considerations.

As with GitHub Copilot, Amazon CodeWhisperer, and other tools in this space, I think the focus should be on accelerating development and shifting the focus of skilled developers to more complex or higher-value tasks.

Context

In March, Ruixin Xu shared a community blog detailing a Fabric Copilot pricing example, building on the February announcement of Fabric Copilot pricing. It’s exceptionally useful, detailing how Fabric Copilot works and the maths behind calculating consumption through an end-to-end example. I’m aiming to minimise any regurgitation here, but I’m keen to add my view on key considerations for rolling out the use of Fabric Copilot. For the purpose of this blog, I will focus on cost and value considerations, with any views on technical application and accuracy to be compiled in a follow-up blog.

[Image: Source: https://blog.fabric.microsoft.com/en-us/blog/fabric-copilot-pricing-an-end-to-end-example-2/]

Enabling Fabric Copilot

First, it’s worth sharing some “mechanical” considerations:

  • Copilot in Fabric is limited to F64 or higher capacities. These start at $11.52/hour pay-as-you-go, or $8,409.60/month, excluding OneLake storage costs
  • Copilot needs to be enabled at the tenant level
  • AI services in Fabric are in preview currently
  • If your tenant or capacity is outside the US or France, Copilot is disabled by default. From MS docs - “Azure OpenAI Service is powered by large language models that are currently only deployed to US datacenters (East US, East US2, South Central US, and West US) and EU datacenter (France Central). If your data is outside the US or EU, the feature is disabled by default unless your tenant admin enables Data sent to Azure OpenAI can be processed outside your capacity's geographic region, compliance boundary, or national cloud instance tenant setting”
[Image: Source: https://learn.microsoft.com/en-us/fabric/get-started/copilot-fabric-consumption]

Recommendations

  1. My first recommendation is to “give it a go” yourself. Based on how Fabric Copilot pricing works at the time of writing, I can see clear value in developers utilising Fabric Copilot - even for a simple example that optimistically might only take 30 minutes of development, a $0.63 cost feels pretty hard to beat. It’s hard to imagine Fabric Copilot costing more than a developer’s time.
  2. Consider who actually needs or benefits from access to Copilot. From what I’ve seen so far, the primary use case is around accelerating development for analysts and engineers, so those consuming reports might see much less value in comparison. Personally, I would also recommend any outputs are appropriately tested and reviewed by someone with the capability of building or developing without Copilot (for now, at least).
  3. Unsurprisingly, it feels as though Copilot could really accelerate analytics and engineering development, but I think it’s crucial that organisations considering adoption roll out the use of Copilot in stages, starting with small user groups. This is for two reasons: it supports the next recommendation on my list, and it helps with managing and monitoring resources at a smaller scale before considering the impact on your capacity.
  4. I think it will be important to build out your internal best practice guidance / knowledge base. For example, in Ruixin’s blog, the CU consumption for creating a dataflow to keep European customers was about 50% more than what was required for creating a PowerBI page around customer distribution by geography. In my opinion, the benefit in terms of time saved is larger in the PowerBI use case than a simple dataflow column filter. In this example, I would also suggest that it’s best practice to use Copilot to generate a starting point for PowerBI reports, rather than something ready for publication / consumption. As is the case in many applications of Generative AI, there’s likely additional value in standardising Fabric Copilot inputs in terms of more consistent costs, so developing a prompt library could be useful.
  5. You need to plan for expected CU consumption based on users and capacity SKU. Admittedly this seems obvious, and is likely only a potential issue at a scale where multiple people are utilising Fabric Copilot at once against the same capacity. For context, although organisations with more than 20 developers may be on a larger capacity than F64, a team of 20 developers submitting Fabric Copilot queries of a similar cost to those detailed in the blog (copied below) would consume more than 100% (20 x 12,666.8 = 253,336 CU seconds) of the available F64 capacity (64 x 3,600 = 230,400 CU seconds) in a given hour - see the worked sketch after this list. Admittedly, this needs to be considered over a 24-hour period, and it’s unlikely that parallel requests will be submitted outside standard working hours, but it should be evaluated alongside other processes such as pipelines and data refreshes.
  6. Though I believe much of the Microsoft guidance around data modelling for using copilot is generally considered good practice for PowerBI modelling, I would recommend assessing your data model and adapting in line with the Microsoft guidance in order to maximise effectiveness of utilising Fabric (or PowerBI) Copilot.
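For anyone who wants to sanity-check recommendation 5, here is a minimal sketch of that capacity maths. The per-developer figure is the one quoted from the pricing example above and is used purely for illustration.

```python
# Back-of-the-envelope check on the Copilot capacity maths in recommendation 5.
developers = 20
copilot_cu_seconds_per_dev = 12_666.8   # CU(s) per developer's Copilot usage in an hour (from the pricing example)
f64_cu_seconds_per_hour = 64 * 3600     # F64 = 64 CUs, i.e. 230,400 CU seconds available per hour

total_demand = developers * copilot_cu_seconds_per_dev   # 253,336 CU seconds
utilisation = total_demand / f64_cu_seconds_per_hour     # ~1.10, i.e. ~110% of an F64 hour

print(f"Copilot demand: {total_demand:,.0f} CU(s) vs {f64_cu_seconds_per_hour:,} CU(s) available "
      f"({utilisation:.0%} of the hour, before 24-hour smoothing)")
```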
[Image: Source: https://learn.microsoft.com/en-us/power-bi/create-reports/copilot-create-report-service]
]]>
<![CDATA[Microsoft Fabric Analytics Engineer Associate (DP-600) Certification - My Experience & Tips]]>
https://blog.alistoops.com/microsoft-fabric-analytics-engineer-associate-dp-600-certification-my-experience-tips/6632087f3f0d860001248ad0Wed, 01 May 2024 09:18:29 GMT

TLDR

The Fabric Analytics Engineer Associate certification is intended to cover almost all generally available functionality of Fabric at the time of writing. That said, it’s not surprising that it covers lots of modules and collateral that exist in other data engineering and data analyst learning, and I expect the exam material will change as the product matures and additional functionality is added - this will be interesting to review over time. It’s currently quite PowerBI focused, and I don’t expect that to change a great deal.

While it’s easy to question where an “Analytics Engineer” sits in terms of role alignment, given that data analyst and data engineer roles are more common in the industry than analytics engineers, I do feel that it covers enough topics across both areas to live up to the name. Any potential candidates will need in-depth knowledge of SQL, Spark, and DAX, as well as of developing Fabric analytics workloads and managing or administering Fabric capacities. Some topics are covered with a purely Fabric lens (e.g. version control, deployment best practices, and orchestration), which is totally reasonable, but it’s worth potential candidates considering broader foundational analytics engineering training on topics that aren’t covered in the core DP-600 learning.

I think that any data practitioners interested in, or needing to prepare for, working with Fabric would benefit from undertaking the DP-600 learning and exam.

What is the Analytics Engineer Certification and Who is it Aimed at

[Image: Fabric Capabilities and Components]

Before getting into details, it’s worth mentioning that the Fabric Analytics Engineer Associate certification follows a similar structure to other Azure certifications in that the name of the certification and the name of the exam are different, but ultimately passing the DP-600 exam grants the certification. As such, I will use the two terms synonymously.

The exam details do a good job of describing the experience needed, and skills measured.

Experience Needed:

  • Data modeling
  • Data transformation
  • Git-based source control
  • Exploratory analytics
  • Languages, including Structured Query Language (SQL), Data Analysis Expressions (DAX), and PySpark

Skills measured:

  • Plan, implement, and manage a solution for data analytics (10–15%)
  • Prepare and serve data (40–45%)
  • Implement and manage semantic models (20–25%)
  • Explore and analyze data (20–25%)

To anyone already operating in the domain, most of the skills and experience assessed won’t be surprising. I think much of the material will feel familiar to those working as data practitioners in Azure, especially PowerBI experts, but it’s relevant to consider that most candidates will feel stronger in, or more closely aligned to, either the analytics or the engineering topics. The interesting thing here is that I don’t think it’s immediately obvious whether a more analytics-focused or engineering-focused background would be more beneficial, but at first glance, I would suggest this reads as slightly more focused on analytics. I will add my views and explain specifics in later recommendations but, ultimately, I feel the exam is aimed at people who want a rounded view of managing Fabric capacities and building and deploying Fabric workloads, whatever their title may be.

Exam Preparation

For context, I took the DP-600 in early April 2024, and given it was only in beta in early 2024, there aren’t yet many high-quality courses from reputable trainers. I used Microsoft Learn’s self-paced learning (available here) and the practice assessment.

I’ve since seen a number of Reddit posts (r/MicrosoftFabric) and YouTube videos (from users such as Guy in a Cube) that look promising, and I’m sure popular trainers in the Azure space such as John Savill and James Lee will have DP-600 training content in future.

There are a couple of additions worth pointing out here. Though I was confident in my SQL, PySpark, and PowerBI skills, I would recommend that if you’re new to the domain, or feel there’s room for improvement in one of those areas, it would be worthwhile considering resources such as Udemy or Coursera SQL courses, Cloudera or Databricks PySpark training, MS Learn PowerBI training, or your favourite training provider to close any gaps. It’s worth noting that the exam expects a level of SQL and PySpark knowledge beyond what is covered in the MS Learn labs.

My Experience & Recommendations

Before sharing focus areas for learning, and key topics from the exam, there are a few mechanical pieces to call out:

  • Do the practice assessment Microsoft offer (on the DP-600 page linked above) - none of the questions here came up in the exam, but I felt they were exactly the right level of difficulty, with a good spread of topic areas to assess readiness prior to booking the exam. I did experience the webpage crashing and not remembering my progress (I’m assuming it’s a randomised question set / structure), so I would suggest doing this in one sitting
  • Do all the labs / hands-on demos - it’s very easy to want to move past these as much of the text repeats the theoretical material covered in the learning modules, but there really is no substitute for hands-on experience. I would also note that the only topics I struggled with in the practice assessment were ones I had not done the lab for
  • Get used to navigating MSLearn - you can open an MSLearn window during the exam. It’s not quite “open book” and it’s definitely trickier to navigate MSLearn using only the search bar rather than search engine optimised results, but effectively navigating MSLearn means not always needing to remember the finest intricate details. That said, it is time consuming, so I aimed to use it sparingly and only when I knew where I could find the answer quickly during the exam. Note that this won’t help for most PySpark or SQL syntax / code questions
  • Though it’s not a prerequisite, I would firmly recommend undertaking the PL-300 (PowerBI Data Analyst Associate) exam before DP-600. So much of the DP-600 learning is based on understanding DAX, PowerBI semantic modelling and best practices, and PowerBI external tools (Tabular Editor, DAX Studio, ALM Toolkit). Those who have undertaken PL-300 will have a much easier time with the DP-600 exam.

As for topic areas covered during the exam:

  • Intermediate to advanced SQL knowledge - most SQL questions rely on knowledge that isn’t covered via MSLearn beyond a few examples. These include understanding SQL joins, where, group by and having clauses, CTEs, partitioning (row number, rank, dense rank), LEAD and LAG, and date/time functions
  • PowerBI - in addition to the above pieces noted in relation to PL-300, I would also call out creating measures, using VAR and SWITCH, Loading data to PowerBI, using time intelligence functions, using data profiling tools in PowerBI, and using non-active relationships (functions such as USERELATIONSHIP). Managing datasets using XMLA endpoints and general model security are also key topic areas
  • Beginner to intermediate PySpark (as well as Jupyter Notebooks and Delta Lake tables) - in all honesty, I think the more hands-on PySpark experience you have, the better. But I was only specifically asked about things like the correct syntax for reading / writing data to tables, profiling dataframes, filtering and summarising dataframes, and utilising visualisation libraries (matplotlib, plotly) - see the short sketch after this list for the level of syntax involved
  • Fabric Shortcuts - I think this was relatively clear in the learning for basic examples, but it’s worth understanding how this would work in more complex scenarios such as multi-workspace or cross-account examples (same applies to Warehouses), and how shortcuts relate to the underlying data (e.g. what happens when a shortcut is deleted)
  • Core data engineering concepts such as Change Data Capture (CDC), batch vs. real-time / streaming data, query optimisation, dataflows and orchestration, indexes, and data security [Row-Level Security (RLS) and Role-Based Access Control (RBAC)]
  • Storage optimisation techniques such as Optimize and Vacuum 
  • Fabric licenses and capabilities (see here)
  • Source control best practices for Fabric and PowerBI
  • Knowledge of carrying out Fabric admin tasks - for example, understanding where capacity increases occur (Azure portal), where XMLA endpoints are disabled (tenant settings), and where high concurrency is enabled (workspace settings)
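To give a feel for the level of PySpark syntax referenced above, here is a minimal, hedged sketch of the kind of operations involved - reading a Delta table, filtering and summarising a dataframe, and writing the result back. All table and column names are hypothetical, not taken from the exam.

```python
from pyspark.sql import functions as F

# Read a lakehouse Delta table (the `spark` session is pre-created in Fabric notebooks).
df = spark.read.table("sales_orders")

# Filter and summarise the dataframe.
summary = (
    df.filter(F.col("order_date") >= "2024-01-01")
      .groupBy("region")
      .agg(
          F.countDistinct("order_id").alias("orders"),
          F.sum("order_value").alias("revenue"),
      )
)

summary.show(10)  # quick inspection / profiling

# Write the result back as a Delta table.
summary.write.mode("overwrite").format("delta").saveAsTable("sales_summary_by_region")
```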

What next?

I have no certifications planned for now, but I plan to build something around MS Purview, similar to what I did with AWS DataZone. I will plan over the next few months what the next certification step is but, given the recent additions to AI-102 (AI Engineer Associate) related to OpenAI, I will likely start there.


]]>
<![CDATA[AWS Data Engineer Associate and Data Analytics Specialty Certifications Compared]]>
https://blog.alistoops.com/aws-data-engineer-associate-and-data-analytics-specialty-certifications-compared/6595edaa3c1634000146e0f0Mon, 29 Jan 2024 22:39:36 GMT

Intro and Context

Let me start by saying that I usually wouldn’t compare two AWS certifications beyond considering common services (e.g. how much of my exam prep for X will have helped for certification Y), as I think AWS do a good job of logically separating the certifications as long as you understand the differences between the practitioner, associate, professional, and specialty levels. I also typically take each learning experience independently and try to cover all material even if it was part of a previous certification.

However, alongside the announcement of the new Certified Data Engineer Associate (DEA-C01) certification in November 2023, AWS announced they will be retiring AWS Certified Data Analytics - Specialty (DAS-C01) in April 2024, stating “This retirement is part of a broader effort to focus on certifications that align more closely to key cloud job roles, such as data engineer.”

Having undertaken the Certified Data Engineer Associate beta exam in December, I thought it would be worth sharing my views on comparing the two certifications.

Exam Guide Comparison

Reading through the exam guide for DEA-C01, it felt familiar, and I don’t think this is just due to domain experience. The reason it felt familiar to the Data Analytics Specialty (DAS-C01) content was evident when looking at the list of services, but even at the higher level of certification domains (see the table below) you can see there would be a large degree of overlap.

| Domain | Data Analytics Specialty     | Data Engineer Associate            |
|--------|------------------------------|------------------------------------|
| 1      | Collection                   | Data Ingestion and Transformation  |
| 2      | Storage and Data Management  | Data Store Management              |
| 3      | Processing                   | Data Operations and Support        |
| 4      | Analytics and Visualisation  | Data Security and Governance       |
| 5      | Security                     | N/A                                |

In terms of services in the exam guides, the notable differences are the additions of AppFlow & Managed Workflows for Apache Airflow (MWAA), Cloud Financial Management (Budgets, Cost Explorer), AWS Batch & Serverless Application Model, Amazon Keyspaces (for Apache Cassandra), and AWS developer tools (Cloud9, CDK, Code*) for the associate data engineer certification.

The overlap is unsurprising given the link between data analytics and engineering, as well as the fact that the data analytics specialty focuses on a lot more than just analytics. That said, the real reason I wanted to call this out is that the data analytics specialty certification recommends five years of domain experience (compared to 2-3 years in data engineering and 1-2 years with AWS for the data engineer associate).

It’s worth mentioning that there is a distinct difference between associate-level and specialty-level certifications. As mentioned above, that starts with suggested experience, which I think this Adrian Cantrill page covers quite well, but it also shows in the exam format. The data engineer exam guide indicates 65 questions (50 scored) and, although the exam duration isn’t specified, it’s safe to assume it will be 130 minutes as is the case for the other associate exams; the specialty exam has the same question structure but a 180-minute duration. As you might expect, that typically means lower-complexity questions for the associate-level exam.

Exam Experience

I appreciate this is only useful to anyone who has previously undertaken DAS-C01, or already prepared for it, and is considering DEA-C01 in future. If that’s not you, just skip this section.

Though the maximum durations above indicate less time is needed for the data engineer associate certification, I don’t think that fully reflects the jump in difficulty. I found time not to be tight at associate level, finishing each exam with a good amount of time to spare, but I only finished the data analytics specialty exam with a few minutes to spare. Difficulty aside, I found the specialty exam experience a lot more taxing, requiring a lot of mental stamina.

In general, the certified data engineer associate questions are centred on less complex solutions (i.e. fewer interacting components or requirements to assess).

In line with the FAQs around why AWS are retiring the data analytics specialty certification, it felt like questions around overarching solution design were much less common in the data engineer associate exam, replaced with more developer-type questions (e.g. what you would do as a data engineer, how you would configure something, or code you would write).

Data engineer associate questions follow a more typical format (e.g. least operational overhead, lowest cost). Though these terms also appear in the specialty certification, the tie from question to correct answer isn’t always as clear, in the sense that things are less obvious (e.g. there could be multiple seemingly correct answers, but a specific requirement for queue ordering, throughput, or access controls makes one slightly different).

There were specific topics with different levels of focus. Compared to the data analytics specialty, my beta exam for certified data engineer associate had:

  • A similar level of focus on data types, formats, and storage
  • Less focus on DB scaling and deployment mechanisms
  • Slightly less detail on Streaming solutions - still a key topic, but less depth required. Specifically, nothing on MSK or KCL, and very little on error handling
  • Similar areas covered in security and governance such as row level security, IAM, cross account access, but less around encryption
  • Much less (very little) on EMR and open-source spark applications
  • More focus on Glue and Lambda
  • Additional questions covering AppFlow, MWAA, AWS Batch, Cloud9, and Cost Explorer
  • Similar coverage on database services
  • Much more focus on handling PII
  • Some specific questions around SQL, Regex, and hashing

This list is not exhaustive, just the areas I found more notable.

My Thoughts on the Change

In all honesty, I don’t see this as particularly positive or negative. I personally found the experience of achieving the data analytics specialty certification to be rewarding, but I often comment on the fact that it covers a large range of topics within the data domain, not just analytics, and although it requires a lot of solution architecture knowledge that not all data practitioners may have, or want to have, it will add value in some way to most people working in the data domain. Naturally, the question that jumps to mind is: what certifications should data practitioners take if they aren’t engineers or scientists?

However, role-aligned certifications are certainly clearer in terms of deciding the appropriate learning and certification pathway to follow and recognising the relevance to your day job, and I think there are good options, both AWS and non-AWS, for those working in data that may not be well aligned to the AWS developer, data engineer, or ML specialty certifications. Also, the collateral and training for the data analytics specialty certification isn’t going anywhere, and will still be valuable for those who need the level of depth that isn’t covered by the new data engineer associate certification.

With all that said, AWS’ intention of aligning a certification to the role of a data engineer is clear and easy to understand. I’m very interested to see what this means for potential changes or additions to other associate and specialty level AWS certifications as well as how the data engineer associate exam matures over time. If my beta exam is anything to go by, AWS certified data engineer associate will be a worthwhile certification for existing or aspiring AWS Data Engineers.

]]>
<![CDATA[AWS Certified Data Engineer Associate Beta - My Experience & Tips]]>TLDR;

The AWS Certified Data Engineer Associate certification is intended to cover much of the same material as the data analytics specialty certification. In that vein, it felt more domain-aligned than the existing associate level certifications, and covered a large range of topics to a reasonable level of depth. Having

]]>
https://blog.alistoops.com/aws-certified-data-engineer-associate-beta-my-experience-tips/658dfde53c1634000146e023Thu, 28 Dec 2023 23:39:41 GMTTLDR;AWS Certified Data Engineer Associate Beta - My Experience & Tips

The AWS Certified Data Engineer Associate certification is intended to cover much of the same material as the data analytics specialty certification. In that vein, it felt more domain-aligned than the existing associate level certifications, and covered a large range of topics to a reasonable level of depth. Having worked in a data engineering capacity in previous roles, it’s clear the exam covers relevant subject areas and I think much of the material will feel familiar to those working as data practitioners in AWS or existing data engineering roles

There were some very specific questions that I'm not sure would fairly reflect the ability to operate as a data engineer on AWS, but these were few and far between. It's worth noting that undertaking the beta exam means there was a larger spread of difficulty in questions, a longer exam experience, and some difficulty in exam prep (not knowing where to focus), all of which mean the future exam experience will likely be different.

Ultimately, I feel that when the exam is available in April 2024, it will be of value for those in data engineering roles, aspiring data engineers, or those in other data roles such as scientists and analysts looking to broaden their understanding.

What is a beta exam?

So I think it’s worth calling out that the beta Associate Data Engineer exam (DEA-C01) was available between November 27, 2023 and January 12, 2024 - I undertook it in December 2023 - which is relevant context for what follows in this blog, as much of it will be subject to change when the exam is generally available in April 2024. AWS use beta exams to test exam item performance before use in a live exam, but there are a few key differences worth calling out. Though these are specific to the associate DE exam, they are applicable in some way to all beta exams;

  • The exam cost is reduced by 50% (to 75 USD plus VAT)
  • The duration and number of questions are different. In this case, all other associate exams* are 130 minutes in duration and 65 questions (50 scored, 15 not scored), whereas the associate data engineer exam is 170 minutes in duration with 85 questions. The DEA-C01 exam guide indicates the same format as the other associate level exams (once it moves from beta to generally available) in terms of question numbers but doesn't call out duration - for now, I would expect it to also be 130 minutes
  • You don’t get results within the typical 5 day window following exam completion. Instead, your results are available 90 days after the beta exam closes. In this case, that would mean early April, so I do not know how I scored yet
  • Though not specific to the exam itself, the nature of the available preparation material being limited to an exam guide and a practice question set containing 20 questions, as well as 3-4 skill builder links, means that it’s difficult to be confident around exam readiness

*The SysOps Associate exam has been in this format since March 2023, when the labs were removed, until further notice.

What is the DE associate cert and who is it aimed at?

I’m not going to regurgitate the exam guide beyond the bullet points below, but unsurprisingly the exam is aimed at data engineers or aspiring data engineers. Compared with the other 3 associate level certifications, I would say it’s a little closer to the developer associate than solutions architect or SysOps administrator, on the basis that the latter two can add a lot of value to people working in AWS who aren’t fulfilling administrative or solutions architect roles. In my opinion, the certified data engineer certification, while not aimed solely at data engineers, is only of real value to those working in data, or those hoping to.

As in the exam guide, the exam also validates a candidate’s ability to complete the following tasks:

  • Ingest and transform data, and orchestrate data pipelines while applying programming concepts.
  • Choose an optimal data store, design data models, catalog data schemas, and manage data lifecycles.
  • Operationalize, maintain, and monitor data pipelines. Analyze data and ensure data quality.
  • Implement appropriate authentication, authorization, data encryption, privacy, and governance. Enable logging.

I plan to write a brief follow-up post sharing a comparison between the data engineer associate and data analytics specialty certifications, but it is worth mentioning that there is a significant overlap in domains and services on the exam guides, so those who have previously passed the data analytics specialty will be well positioned to pass the associate data engineer certification.

Exam Preparation

This will be short and sweet. With the exam in beta, I didn’t have much to go on for preparation. I used 3 resources alongside the exam guide;

  • AWS Skillbuilder - for the 20 question exam set
  • Big data analytics whitepaper - I only used this to brush up on a couple of services on the basis that I found it exceptionally useful when studying for the data analytics specialty exam. In retrospect, I think the same applies here, and I would have relied on it more
  • Specific parts of the data analytics udemy course by Stephane Maarek and Frank Kane - similar to the whitepaper, for overlapping services

Admittedly, my preparation could have been better. You'll note I didn't reference anything under "Step 2" on the exam page. It's worth noting that since undertaking the exam, I stumbled across this training course. I've not used it, but thought it worth drawing attention to. Adrian Cantrill is also planning to release a course that is due at the end of January 2024.

My Experience & Recommendations

You might ask yourself why all of the earlier preamble on this being a beta exam matters. The short answer is that the topic areas and observations below should be read with a few key considerations in mind: the questions themselves are subject to change, as with all certification exams over time, but those changes are likely to be more frequent in the earliest days the exam is offered; the increased number of questions means any of the below could be examined, but not all of it will be; and I don’t yet know whether I passed, so take this with a pinch of salt.

In addition to the above, I felt as though the spread in question difficulty was much larger than in other associate level exams. I’ve undertaken all 3 from 2021 to 2023, and the nature of those being well established means the question difficulty tends to be on an even keel - there aren’t many gimmes or much higher difficulty questions. I found that not to be the case during the beta exam, and the same will likely be true during the first few months the exam is available.

Finally, my experience with Pearson Vue online was quite negative. I had to queue after check-in for 50 minutes, and when I was next in the queue it reset me to position 70. I did use the chat function to contact support but got nothing productive in return. Given there are no breaks allowed, this definitely had some effect on my concentration. This was the first time something like this had happened in 7 (I think) exam experiences, so I'm hoping it's a one-off, but next time I would likely use the support / chat to reschedule.

Now on to the focus areas;

  • Data formats - namely JSON, csv, avro, parquet. Existing native service integrations (e.g. can Quicksight import csv), specific data type limitations such as compression types or ability to handle nulls / missing values, and performance implications such as correct use of columnar data types are all important
  • PII redaction - understand implementation of redaction through Sagemaker, Glue, Databrew, Comprehend during transformation, redaction at the consumption layer through RLS in Quicksight, or understanding foundational concepts in hashing or salting (there's a short sketch of salting and hashing after this list)
  • Orchestration - I’d recommend understanding the differences between Glue workflows, step functions, and Managed Workflows for Apache Airflow (MWAA)
  • DMS, data sync, app flow, and data exchange
  • Data catalogues and metastores. Not just Glue, but Hive and external metastores too
  • Analytics - surfacing in Quicksight, Quicksight connections to data in other services such as Redshift and S3, and appropriate user or service access controls
  • Lakeformation - I wouldn't say there's a need to be a Lakeformation expert, but certainly be familiar with it and understand the different elements of access control it grants over other methods
  • Hands on SQL queries (e.g. CTAS and group by, where / having clauses) are important to understand so be sure you're comfortable with basic SQL (see the Athena CTAS example after this list)
  • Most DB services are likely to be tested - Aurora, MSSQL, PostgreSQL, DynamoDB, DocumentDB, Redshift. I found the Data Analytics Specialty preparation I had done previously to be helpful - it covered things like Redshift key types, federated queries, WLM, Vacuum types, indexes, and hashing
  • Networking - security groups vs NACL and cross region as well as cross account redshift access
  • Serverless - what serverless solutions exist, serverless stacks being aligned to lowest cost, and how / when serverless should be a preference
  • As with all associate exams, some critical areas include preparing for questions types of assessing “least operational overhead” or “lowest cost” options, understanding IAM, and implementing cross-VPC and cross-account solutions
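To illustrate the hashing and salting concepts mentioned above, here's a minimal Python sketch of salted hashing - it isn't tied to any particular AWS service, and the email address is just a made-up example:

import hashlib
import secrets

def hash_pii(value: str, salt: bytes = None) -> tuple:
    """Hash a PII value with a per-value random salt so identical inputs don't share a digest."""
    salt = salt or secrets.token_bytes(16)  # 16 random bytes per value
    digest = hashlib.sha256(salt + value.encode("utf-8")).hexdigest()
    return salt.hex(), digest

salt, digest = hash_pii("jane.doe@example.com")
print(salt, digest)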
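And for the hands-on SQL point, below is a rough example of the kind of CTAS and aggregation syntax worth being comfortable with, wrapped in a boto3 Athena call. The table, database, and bucket names are all hypothetical placeholders:

import boto3

athena = boto3.client("athena")

# CTAS: write an aggregated, Parquet-formatted copy of a source table
ctas = """
CREATE TABLE sales_summary
WITH (format = 'PARQUET', external_location = 's3://example-bucket/sales_summary/') AS
SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
FROM raw_sales
WHERE order_date >= DATE '2023-01-01'
GROUP BY region
HAVING SUM(amount) > 0
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "example_db"},  # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},  # hypothetical bucket
)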

A few topic areas that I wasn’t totally expecting;

  • Regex. I wasn’t surprised to see something around pattern or text matching, but I would recommend reviewing key operators such as starting with, ending in, and upper / lower case (see the short example after this list)
  • Data mesh - it’s important to have a fundamental understanding of the concept and terminology (data products, federated data, distributed teams)
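For the regex operators called out above, a quick Python refresher along the lines of what I mean (the sample values are arbitrary):

import re

values = ["Alpha-123", "beta-456", "GAMMA-789"]

starts_with_a = [v for v in values if re.match(r"^[Aa]", v)]         # starting with
ends_with_digits = [v for v in values if re.search(r"\d{3}$", v)]    # ending in three digits
upper_case_prefix = [v for v in values if re.match(r"^[A-Z]+-", v)]  # upper case only before the hyphen

print(starts_with_a, ends_with_digits, upper_case_prefix)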

Finally, I didn’t see much appearing on;

  • Code commit, deploy, build, or pipeline
  • Machine Learning. Consider data supply for ML use cases and Sagemaker functionality only
  • Containers

What’s next?

I’m planning to go for the solution architect professional exam in 2024 but, otherwise, I will update this in April once I get the results back.

]]>
<![CDATA[AWS AI & Data Conference - September 2023]]>AWS held their AI & Data conference in Kilkenny this September with topical sessions including “Using AI to tackle our planet’s most urgent problems” and “Game Changers” run by Dr. Werner Vogels (Amazon CTO) and Miriam McLemore (AWS Enterprise Strategy). A similar event ran

]]>
https://blog.alistoops.com/aws-ai-data-conference-september-2023/651dd8229b6ad20001888c64Wed, 04 Oct 2023 23:46:48 GMT

AWS held their AI & Data conference in Kilkenny this September with topical sessions including “Using AI to tackle our planet’s most urgent problems” and “Game Changers” run by Dr. Werner Vogels (Amazon CTO) and Miriam McLemore (AWS Enterprise Strategy). A similar event ran in Cork in 2022, and I’m hopeful this will continue to run annually given my experience.

AWS customers and partners were there talking and learning about all the latest developments in AI & Data, including how GenAI is being used in production and driving value. Showcases were presented on using GenAI to action event feedback resulting in a direct revenue increase, and utilising GenAI to automate generation of bid (or RFP) responses based on previous bids using Bedrock and Retrieval Augmented Generation (RAG, or adding context to a foundation model through external data), which was claimed to result in 150 customer conversations in 3 months.


Keynote


Werner covered a lot of information during the session, including talking about a popular airline’s panini predictor and GenAI meaning no longer needing to engage with customers on blockchain (half-joking), but most of the discussion was centred on using technology and data to solve problems related to climate change, sustainability, and helping the vulnerable. This ranged from healthcare in remote areas and transport, to food sources (it’s worth checking out Now Go Build). A few things were emphasised or repeated throughout;

  • People solve problems, not AI alone. Disruption will be a result of people tackling our biggest / hardest problems, with GenAI opening new doors
  • Good AI needs good data. This likely isn’t surprising to anyone in the industry, but it was interesting that Werner positioned the opposite as also true: good data needs good AI as a crucial part of extracting value from your data
  • Having the right data and AI skills and capability is crucial. Or, as Werner put it… “Make sure you’re trained - don’t believe what’s in the papers and media”
  • AI is all around us, and it’s not going anywhere. That’s not to say we shouldn’t be considerate of risks and implementation challenges, especially with new technology, but GenAI is another major development like those we’ve seen in the last six (Sagemaker), ten (Alexa), and twenty-five (Amazon.com) years

I will add that it was amazing to see how captivated the audience was during Werner’s presentation. There was, of course, a lot of interesting and valuable information shared, but the way it was delivered and landed was fascinating. Even when sharing messaging that would already be well known to the audience (data being a foundation for all AI, for example), it was often wrapped in such a story that it felt new.

AWS GenAI Takeaways


It’s natural to gravitate towards all-things GenAI right now, given the potential in that space. Below are a few focus areas from the conference:

  • Architectural patterns and production recommendations to improve value & reduce cost and risk
    • Addressing GenAI Foundation Model (FM) challenges (traceability and recency) through customisation. Though various methods were discussed, including fine-tuning and prompt engineering, RAG was identified as the most significant growth area in this space. AWS patterns include using Kendra (deep learning) and embedding reference data (pgvector or OpenSearch)
    • Considerations for pre-trained vs. training your own FM, and patterns for fully managed vs. self-managed pre-trained model services (Bedrock, Sagemaker Jumpstart). AWS discussed an upcoming tool to support customers in decision making, but also suggested that a (customised) pre-trained FM would meet the needs of the majority of use cases
    • Security monitoring (centrally through AWS Security Hub), content filtering, and other current AWS GenAI-in-production recommendations
  • Amazon Quicksight Generative BI - alongside NLP report interaction, fact-checking, and quick calculations features, AWS showed an awesome “story building” live demo that produced an adaptable PPT-slide summary and blog post, from a Quicksight report, in seconds
  • Codewhisperer - AI coding companion for expediting development activities

Broader AWS Data Takeaways


I thought it was important that this AWS event covered so much across the whole AI & Data landscape. Admittedly, there would be plenty more to add if I had been able to attend all the analytics and database sessions, but a few broader data highlights I captured include:

  • Data governance, strategy and ethics are not just “as important” as ever, but more so;
    • Defining governance tenets and principles before policies & procedures, empowering development teams, and data product approaches were just a selection of covered topics. Modern data architecture and solutions require changes to how we govern and manage data
    • There was good conversation on ethics and Responsible AI (RAI) aligned to the AWS RAI core dimensions (see here, also similar to Microsoft’s). Personally, I believe it's not uncommon to focus more on some dimensions than others, such as privacy & security or explainability, and often there is room to improve ethics considerations across the data lifecycle (e.g. collecting, analysing, and disseminating). I’m encouraged to see increasingly active conversation around data ethics
  • Zero ETL - I’m still not sold on zero ETL but I understand where it’s pitched, and that message was clear: reduce high effort, low value work. It was interesting that most of the time on this topic during the conference was dedicated to service integrations between AWS services, such as those between Aurora and Redshift, Redshift and Glue Data Catalog, and Glue Data Catalog and downstream tools for consumption (EMR, Quicksight, Sagemaker). The examples presented were ones where reducing overhead to focus engineering efforts in other spaces (e.g. 3rd party / non-AWS sources, complex Spark pipelines) would be beneficial
  • Culture & working backwards were referenced many times. Key cultural talking points were around engaging executives, data guiding decision making, and data proficiency as a core skill
    • There was a simple phrase that cropped up a few times - “start business backwards, not data forwards” which I think is a helpful reminder
    • AWS shared two interesting figures. Though I’d be keen to understand how they were gathered, I still thought they warranted a mention;
      • 79% of challenges delivery teams see are cultural
      • CDOs tend to spend the majority of their time on challenges around culture (69%), not data

Wrapping Up

I’m appreciative of the opportunity to meet peers, customers, other AWS partners, and people in the community, as well as to see representation from senior AWS team members across all sessions. I thought it was clear that AWS were invested in their partners and customers throughout the island of Ireland.
It was really exciting to see what AWS are working on in this space and what’s coming up. There’s a lot to look forward to.

]]>
<![CDATA[AWS Data Analytics Specialty Certification - My Experience & Tips]]>Key Takeaways

The AWS Data Analytics Specialty certification was a challenging learning experience that requires broad and deep understanding of all designing, building, securing and maintaining of data analytics solutions on AWS. There is a good split between testing of domain expertise (databases, visualisation, storage, ETL, etc.) alongside solution architecture

]]>
https://blog.alistoops.com/my-experience/64c40c70f022a1000115ffecFri, 28 Jul 2023 19:34:21 GMTKey TakeawaysAWS Data Analytics Specialty Certification - My Experience & Tips

The AWS Data Analytics Specialty certification was a challenging learning experience that requires broad and deep understanding of designing, building, securing and maintaining data analytics solutions on AWS. There is a good split between testing of domain expertise (databases, visualisation, storage, ETL, etc.) and solution architecture and design. Though I found much of the information tested on the exam beneficial to learn and understand, it’s worth noting that some specific elements will not be valuable on a day to day basis (looking at you, DynamoDB capacity unit calculations). I do, however, feel as though undertaking this certification has had an overall positive impact on my day job.

I personally found the most challenging element of the certification to be the mental stamina required to sit the 3 hour long exam, so do try some practice exams beforehand.

Background

AWS have a number of specialty exams (6 at the time of writing). I decided to target the Data Analytics Specialty as my third AWS certification after the cloud practitioner and solution architect associate certificates. I highly recommend Adrian Cantrill’s guidance around certification pathways as I found it really useful, and would echo the sentiment around the level of difficulty for the Data Analytics Specialty exam - there was definitely some above-associate-level solution architecture knowledge required that I felt equipped for given my existing analytics domain knowledge, so I would see either Solution Architect Professional (with little to no analytics experience) or Solution Architect Associate (with a lot of analytics experience) as pre-requisite paths.

I have some information on my background in my “About” page, but I would say the most relevant information is that I had been working with AWS for about a year prior to undertaking this certification, I had at least 5 years of experience across analytics-related technologies (databases, visualisation, distributed computing, and ETL), and already had foundational knowledge around storage, compute, and networking. All of these were essential, but some would be covered through the SAA certification.

Exam Topics & Structure

The exam content is fully described in the exam guide. At a glance, I think the structure of the data analytics specialty content was very clear and likely to make the learning easy to align. I found this to be the case, mostly. The exception is that I felt as though it was more helpful to focus specifically on services and solutions rather than the domains (Collection, Storage & Management, Processing, Analytics, & Visualisation, Security) when shifting from theory and hands on labs to exam preparation. This is somewhat arbitrary, but I thought it worth mentioning for anyone who has a little difficulty when switching to exam prep.

Typical guidance for AWS exams, such as paying attention to key terms like “most cost effective” or “least operational overhead” still apply here, as does ensuring you have a good understanding of single-AZ, multi-AZ and multi-region implications, and I found the next most useful thing in selecting a question response to be specific awareness of integrations (or inability to integrate as may be the case) - often with multiple services involved in a single question it can be tricky to remember all available combinations.

Given that AWS scale scoring so that more difficult / complex questions contribute more to your overall score, of course learning has to consider all aspects. However, I also think this means it’s important to be really confident on the less complex questions. In my exam, these included:

  • Quicksight - choosing the appropriate Quicksight visualisation type given a certain set of data (identifying when a scatter plot might be more useful than a bar chart, for example), and implementing AI-driven forecasting
  • How to flatten JSON data in Glue
  • Calculating DynamoDB WCU and RCU (see the worked example after this list)
  • Performance and log analytics (i.e. knowing when to use ELK / OpenSearch)
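For reference, the WCU/RCU arithmetic is straightforward once you remember the unit sizes: one RCU covers one strongly consistent read per second of up to 4 KB (eventually consistent reads cost half), and one WCU covers one write per second of up to 1 KB. A small Python sketch of the calculation:

import math

def rcu(item_size_kb, reads_per_second, strongly_consistent=True):
    """One RCU = one strongly consistent read/sec of up to 4 KB; eventually consistent costs half."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)

def wcu(item_size_kb, writes_per_second):
    """One WCU = one write/sec of up to 1 KB."""
    return math.ceil(item_size_kb) * writes_per_second

# e.g. 6 KB items at 10 strongly consistent reads/sec and 10 writes/sec
print(rcu(6, 10), wcu(6, 10))  # 20 RCU, 60 WCU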

Though all aspects of a relatively large number of services are examined, I found some topic areas to come up slightly more regularly during the exam, including:

  • File type limitations (orc, avro, parquet, csv) including, for example, which options are best suited based on partitioning or formatting requirements, or reading parquet into Quicksight (via Athena)
  • Scaling DBs for HA or DR, auto scaling vs manual scaling, scaling out and up (or horizontally and vertically)
  • Redshift Workload Management, concurrency, node types
  • DynamoDB came up a lot, but I had multiple questions specifically around partition keys
  • Streaming data probably came up most often. I didn’t get many Kafka / MSK questions (only 2-3 I can remember), but did encounter a number of Kinesis questions, probably as it relates to multiple exam domains when you consider integration with Kinesis Firehose, Analytics, OpenSearch, etc. Understanding all integrations and limitations is critical, but you’ll also need to know about consumers and producers, and error handling / retry config (there's a producer-side sketch after this list)
  • Message queues (SQS, SNS), message ordering, visibility timeout, etc.
  • Security - encryption, row and column level security, policies and roles, cross account access
  • EMR provisioning and open source spark applications (use of sqoop, hive, pig, NiFi, flink) and what they do

If I had to pick a handful of services in which a lack of expertise would have had the greatest impact on my exam experience, I would say these are Kinesis (plus integrations), Redshift, EMR, DynamoDB, Glue, and relevant security services (KMS, HSM, and others). This is, of course, entirely subjective and will vary for everyone’s question set. This is just what I observed from my exam, but it will not be a surprising list to anybody already working in AWS data analytics.

Admittedly, Glue didn’t feature as much on my set of questions as I was expecting from my exam prep compared to the others listed above. I also didn’t get any ML questions based around Sagemaker, but did have questions that were focused on use of EMR for ML applications / solutions.

All that said, ultimately, if a service is data-specific, you’ll be expected to have expert knowledge on it. These are well described in the big data analytics whitepaper.

Exam Preparation

As with any AWS exam, working with AWS on a daily basis will almost certainly make the experience easier - though working with some of the exact services in the exam would have the most benefit, even if you’re not doing so, having a foundational understanding of networking fundamentals (VPC, CIDR notation and blocks), common integrations, endpoints, High Availability and Disaster Recovery configurations, etc., will also be of some help. All that said, I undertook the exam when I was not using AWS day-to-day. It’s totally possible, but I would suggest that the learning will likely take a little longer.

Though I know people that have spent a matter of days or a couple of weeks preparing for this exam full time, I was dedicating 2-4 hours per week for studying. It took me 2.5-3 months to prep for and pass the exam.

I have used a variety of learning collateral for previous certifications, but tend to use Adrian Cantrill’s (cantrill.io) and/or Stephane Maarek’s learning material, and Tutorialsdojo’s practice exams. For the Data Analytics specialty, Adrian did not have a course so I used Stephane’s Udemy course. I also read (a few times) the AWS Big Data Analytics Whitepaper - though it was published in 2021, I believe this is still very relevant.

Taking the exam

First of all, if you’ve not taken a professional or specialty level AWS exam, it’s a vastly different experience from any other certification I’ve undertaken in that the question structure and length are different and there are far fewer questions where only one answer feels obviously correct. The exam is not only longer (180 mins compared to 130 at associate), but I did need all of that time to complete it - I finished with about 4 minutes remaining. This makes it a real challenge to maintain focus, and I would highlight the importance of taking practice exams to adjust to this.

I think both points above are reflective of the fact that the exam requires specialist domain knowledge across all data domains and services (as described in Adrian’s video, the “T” shaped knowledge).

In all honesty, I feel as though many technical specialists may still struggle with the exam despite great analytics domain knowledge, because an understanding of architectural considerations, and how services and solutions fit together and integrate, is essential. The exam will be much easier for those working in technical or data architect roles (or with experience in these roles).

In taking the exam virtually (I used Pearson), I had no issues, but if you’ve had any challenges with this previously (being warned for looking away from the camera, or mouthing words as you read questions, etc.), again, the likely challenge will be the exam duration.

One exception to the above, depending on how you typically work, is that you don’t have the ability to use a calculator or pen and paper, either to draw out a solution or to do calculations. There will almost certainly be some mental arithmetic required for specific capacity-based questions (throughput and DynamoDB WCU and RCU calculations, for example).

What next?

Go, dominate the world with your AWS Data Analytics expertise… as for me, I decided to take a small break before going after the remaining associate level certifications (SysOps and Developer). Once I decide whether to go for the solution architect professional or database specialty, I will be sure to capture my process and experience in a future post.

]]>
<![CDATA[AWS Data Governance - DataZone Preview Thoughts]]>Amazon DataZone is an AWS data governance solution announced at 2022 (November) re invent and currently in preview. Focused on democratisation of data by domain and exploring pub-sub self-service access to data. In addition to data access, it covers data catalog, business glossary, and metadata capture functionality. It’s

]]>
https://blog.alistoops.com/aws-data-governance/64ad7f15d2dc710001d4cad7Tue, 11 Jul 2023 17:23:24 GMT

Amazon DataZone is an AWS data governance solution announced at re:Invent in November 2022 and currently in preview. It is focused on democratisation of data by domain and a pub-sub, self-service model for data access. In addition to data access, it covers data catalog, business glossary, and metadata capture functionality. It’s worth noting that the positioning of DataZone is closely linked to some key areas, including data mesh architecture.

A couple of things to point out about the preview: it's currently free, it's limited to the Ireland, US East, and US West regions, and, as per the banner in the UI, it's recommended to avoid using it for production purposes while in preview.

Why I’m Interested

Data management and governance has been vital to the success of data teams for as long as I have been working in the field (and much longer), but I believe it has had a renewed focus over the past couple of years. There are a number of reasons for this that I won’t be diving into here, such as the simple fact that many key pillars in technology and data tend to go through cycles of perceived importance. That said, I feel it worth mentioning that the move to cloud, the huge increase in production and availability of data, and the treatment of data as a product make the value proposition of a service like AWS DataZone potentially massive.


Key Enablers

I’m keen to get to my experience with the DataZone preview as soon as possible, but I think there are a couple of key things worth calling out in terms of key enablers or success factors - i.e. the things that may maximise the value of a DataZone deployment, or things that I think might result in challenges if you compare your experience to the slick tech demos seen online.

  • Data stewards & Product owners - roles and responsibilities for self-service
  • Understanding of existing data / domain structure - it’s important to consider how things should be structured for ease of management and use
  • Maturity of datasets being published - to maximise value of AWS DataZone, the data must be usable and of high quality for consumers


Cost

DataZone is free for 3 months of use during preview, with some reasonable limits described in the pricing, so I think it’s a great opportunity to go and try using the service. After that, there is both a per month, per user cost ($7.20-9 USD) and a small cost for storing metadata. I’m unsure what the metadata cost will look like in real terms for large deployments, but it seems as though most of the cost will be on a per user basis.

Experience

I have seen some demos of DataZone where a series of steps were conducted to create data in Athena and bring it in as net-new. I wanted to try pulling in already existing data to see what the process looked like, as I imagine most deployments I'm likely to see will fit this use case. As it has been a few months since I’ve seen any demos or docs, I also wanted to go through the process as a completely new user to get a feel for what the learning curve would be like. Fortunately, I had some data from Kaggle (formula one dataset) in S3 already, so the only pre-requisite step I conducted was to set up a Glue crawler to catalogue that S3 data into a new database.
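I set the crawler up through the console, but for completeness, an equivalent boto3 sketch looks roughly like the below - the crawler name, IAM role, database, and S3 path are all placeholders for my setup:

import boto3

glue = boto3.client("glue")

# Crawl the existing S3 data into a new Glue database (names, role, and path are placeholders)
glue.create_crawler(
    Name="f1-dataset-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    DatabaseName="f1_data",
    Targets={"S3Targets": [{"Path": "s3://example-bucket/f1-dataset/"}]},
)
glue.start_crawler(Name="f1-dataset-crawler")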

Glue tables after crawler run

The creation of a domain was fairly straightforward, and the same can be said for creating DataZone projects, but this was where I hit my first stumbling block, as I couldn’t see any option to create the project linked to an existing Glue DB. I tried going back to the AWS console and adding my existing Glue DB as a data source, and voila, I could use an existing DB within the publisher project.

After running the crawler, publishing it to DataZone and making it active was really simple. I also liked that I had the option to either automatically publish as active, or draft (e.g. to remove table prefixes or similar) then active.

I then went through a couple of steps to create business glossary terms and metadata forms. Again, the creation was self explanatory, as was adding links to existing published data. The only challenge here is that it feels as though it would be quite manual to set up for a large data estate. It’s something I could see being improved over time such as adding a glossary term for “pit stop” to any data set with that in the title or columns.

Searching the new domain after publishing initial assets and adding metadata ("F1 Data Labels")

After doing all of this setup, creating the subscription elements didn’t take long; there was just one added step where access wasn’t granted after subscription approval. The issue was related to Lakeformation permissions, and DataZone was really clear in sharing what needed to be granted, so in combination with AWS documentation this was easily resolved. I followed the steps to add myself as datalake admin, remove the allowed principals (link, point 1), and grant specific access to the S3 location, Glue database, and tables (via the Lakeformation UI), re-linked in the DataZone UI, and that was all the setup done. I even tested with an additional IAM user to see what it might look like for non-admins, and I felt this was all quite seamless.
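For anyone scripting rather than clicking through, the table-level grant step looks roughly like this with boto3 - the principal ARN, database, and table names below are placeholders, and the exact principal to grant is surfaced by DataZone itself:

import boto3

lakeformation = boto3.client("lakeformation")

# Grant the consuming role SELECT/DESCRIBE on the crawled table (principal, database, and table are placeholders)
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/datazone-environment-role"},
    Resource={"Table": {"DatabaseName": "f1_data", "Name": "pit_stops"}},
    Permissions=["SELECT", "DESCRIBE"],
)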

Consumer project view after subscription approval and access granting

It may have taken a little longer first time round, but I think from beginning the setup to an end user querying the published data in Athena took around an hour. All-in-all, it felt like a brilliant experience.

Athena query from consumer view

Pros:

  • AWS-native solution for data governance meaning seamless AWS integration
  • Very user-friendly
  • Business glossary, and the fact that search and filtering includes business glossary rather than just data assets
  • Straightforward access management
  • Can see benefit at large scale or multiple domains and data mesh environments where ownership and stewardship is democratised, but also in centralised data teams / functions with a broader group of users / consumers

Cons:

  • Data Lineage not a key feature / use case
  • Each data asset has to be subscribed to individually, where the option to multi-select would be helpful. The same applies to revoking subscriptions and deleting published assets
  • A couple of specific issues in UI navigation - the “open data portal” link didn’t seem to work on first click every time, and query data (Athena) didn't work on iPad but did in Safari on a MacBook. This can be expected in preview, and didn't affect the overall experience

Future Considerations:

  • Currently, notifications of subscription requests only appear in the DataZone portal. It would be cool to see email or text notifications (e.g. via SNS) so it's not reliant on admins or data stewards being in the data portal for visibility
  • Data assets are limited to datasets - I would love to see this extended, for example, to include QuickSight dashboards
  • It would be nice to see some business glossary term links identified automatically / dynamically in future

The big question - would I recommend it?

I think the short answer here is yes. There are a few things I’d like to dig deeper into, including cross-account data sharing, integration with partner solutions (Salesforce, ServiceNow), what the workflow looks like for updating data assets from source, and learning the common pitfalls as with any new tool (e.g. don’t delete the Glue tables before unsubscribing consumer jobs), but I think the potential for DataZone is huge, and it’s a large step in the right direction for AWS-native services supporting data management and governance. I’m really excited to see how DataZone develops going forward.

]]>
<![CDATA[Hosting a Ghost Blog on AWS using Lightsail and docker]]>Intro & Context

For a long time now, I've been self-hosting many of services at home for personal use (more on that in another post yet to be written), which has been a key part of my personal learning across a range of things including networking, docker, and

]]>
https://blog.alistoops.com/test-post/646d3de43d42190001e8f67dTue, 23 May 2023 22:27:54 GMT

Intro & Context

For a long time now, I've been self-hosting many services at home for personal use (more on that in another post yet to be written), which has been a key part of my personal learning across a range of things including networking, docker, and Linux command line. I’ve previously had installations of Wordpress and Ghost on my home network, but decided not to continue further after initial exploration as I didn’t want to keep an externally available blog on my local network.

I typically like to have control over the things I deploy and work on so, despite the simplicity of setup, I had a preference not to utilise one of the available paid hosting services for Ghost. After some personal and professional learning around AWS, I decided it was finally time to use some AWS services to create this blog and share my experience.

In terms of prerequisites, the only proper requirements are to have an AWS account, a domain, and a hosted zone for your domain (I used Route53, or R53, for both). Though I will make reference to how I set this up, I would recommend starting with purchasing and setting up your domain as it can take a little time for registration to complete. Similarly, I will highlight all relevant steps I carried out for deployment, but some basic understanding of AWS Lightsail, AWS Cloudformation, Linux command line, Docker, and yaml would be beneficial.

Design Decisions

Before progressing any further, there are a couple of important design or architecture decisions that I should describe. It’s worth noting there are rarely “right” answers for these, I just wanted to share my own thinking.

  • First, why Lightsail? The short answer is for simplicity, but rather than regurgitate AWS docs, it's better to just provide the link - https://aws.amazon.com/lightsail/features/.
  • Next, Ghost as a package (AWS documentation available here and here) or docker container? This might be a little trickier as it will really depend on your comfort working with both Linux command line and docker, but the main reasons I went with docker were portability should I decide to move hosting elsewhere (should be simpler), and configuration as I felt more confident configuring reverse proxy, SSL and updating configuration options via docker-compose.
  • There are a few options for reverse proxy, but two popular ones I considered are nginx and caddy. I went with Caddy as I felt as though it had the most straightforward setup for a single web service (caddy reverse proxy docs).
  • Finally, AWS Console vs Cloudformation deployment. On this one, the primary driver was for experience and learning. Beyond that, there are a couple of smaller benefits should you choose Cloudformation such as making it a little easier to create multiple environments such as a development deployment as well as making the process of tearing down and redeploying simpler and quicker.

Lightsail Deployment

Now into the deployment itself. I created the below Cloudformation template iteratively by first writing a basic template to instantiate a Lightsail instance, then writing a docker-compose.yaml and testing locally, then putting the two together so that the template carries out some actions when the instance is first launched. These can be seen under the instance UserData, but essentially include:

  • Installing docker
  • Creating the docker-compose file
  • Creating the Caddyfile for caddy configuration

It’s worth noting that including restart: always will make sure the containers come back up after an instance restart.

Description: AWS CloudFormation template for lightsail instance
Parameters:  
  AvailabilityZone:    
    Type: 'AWS::EC2::AvailabilityZone::Name'
    Description: Availability Zone
  InstanceName:
    Type: String
    Description: Instance Name
    Default: ubuntu-blog
  BlueprintID:
    Type: String
    AllowedValues:
      - ubuntu_22_04
      - ubuntu_20_04
      - ubuntu_18_04
    Description: Blueprint ID allowing only ubuntu blueprint ids from May 23
    Default: ubuntu_22_04
  BundleID:
    Type: String
    AllowedValues:
      - nano_2_0
      - micro_2_0
      - small_2_0
      - medium_2_0
      - large_2_0
      - xlarge_2_0
      - 2xlarge_2_0
      - nano_win_2_0
      - micro_win_2_0
      - small_win_2_0
      - medium_win_2_0
      - large_win_2_0
      - xlarge_win_2_0
      - 2xlarge_win_2_0
    Description: Bundle ID
    Default: nano_2_0
  ProjectTag:
    Type: String
    Description: Project tag attribute value
    Default: da-blog-test
  EnvironmentTag:
    Type: String
    Description: Environment tag attribute value
    Default: development

Resources:
  #Lightsail deployment
  LightsailInstance:
    Type: 'AWS::Lightsail::Instance'
    Properties:
      AvailabilityZone: !Ref AvailabilityZone
      BlueprintId: !Ref BlueprintID
      BundleId: !Ref BundleID
      InstanceName: !Ref InstanceName
      Tags:
        - Key: project
          Value: !Ref ProjectTag
        - Key: environment
          Value: !Ref EnvironmentTag
      UserData: |
        #!/bin/bash
        sudo apt-get update -y
        sudo apt-get upgrade -y
        curl -fsSL https://get.docker.com -o get-docker.sh
        sudo sh get-docker.sh
        sudo systemctl enable docker.service
        sudo systemctl enable containerd.service
        sudo usermod -aG docker ubuntu
        sudo curl -L "https://github.com/docker/compose/releases/download/v2.18.1/docker-compose-$(uname -s)-$(uname -m)"  -o /usr/local/bin/docker-compose
        sudo mv /usr/local/bin/docker-compose /usr/bin/docker-compose
        sudo chmod +x /usr/bin/docker-compose
        mkdir -p /home/ubuntu/docker/blog
        mkdir -p /home/ubuntu/docker/blog/secrets
        mkdir -p /home/ubuntu/docker/blog/caddy/data
        mkdir -p /home/ubuntu/docker/blog/caddy/config
        mkdir -p /home/ubuntu/docker/blog/ghost/content
        cat << 'EOF' > /home/ubuntu/docker/blog/caddy/Caddyfile
        {
                email youremail@domain.com
        }
        yoursub.domain.com {
                reverse_proxy ghost:2368
        }
        EOF
        cat << 'EOF' > /home/ubuntu/docker/blog/docker-compose.yml
        version: '3.8'
        services:
          ghost:
            image: ghost:5-alpine
            restart: always
            ports:
              - 2368:2368
            environment:
              # see https://ghost.org/docs/config/#configuration-options
              database__connection__filename: '/var/lib/ghost/content/data/ghost.db'
              # this url value is just an example, and is likely wrong for your environment!
              url: https://yoursub.domain.com
              # contrary to the default mentioned in the linked documentation, this image defaults to NODE_ENV=production (so development mode needs to be explicitly specified if desired)
              NODE_ENV: development
            volumes:
              - ./ghost/content:/var/lib/ghost/content
          caddy:
            image: caddy:2.6.4-alpine
            restart: always
            container_name: caddy
            ports:
              - 443:443
              - 80:80
            volumes:
              - ./caddy/Caddyfile:/etc/caddy/Caddyfile
              - ./caddy/data:/data
              - ./caddy/config:/config
        EOF
Cloudformation template

In the above, make sure to adjust any relevant config, most notably the domain for which you will configure the appropriate A record in your hosted zone.

Next, I navigated to AWS Cloudformation in the AWS console to launch the template. I had previously uploaded the template to S3, but you can also upload it directly. Then follow the GUI steps as relevant.
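If you'd rather script the launch than click through the console, the equivalent boto3 call looks roughly like the below - the S3 URL and parameter values are just examples matching the template's parameters:

import boto3

cloudformation = boto3.client("cloudformation")

# Launch the template already uploaded to S3 (URL and parameter values are examples)
cloudformation.create_stack(
    StackName="ghost-blog",
    TemplateURL="https://example-bucket.s3.amazonaws.com/lightsail-blog.yaml",
    Parameters=[
        {"ParameterKey": "AvailabilityZone", "ParameterValue": "eu-west-1a"},
        {"ParameterKey": "InstanceName", "ParameterValue": "ubuntu-blog"},
    ],
)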

Using the Cloudformation template, most options will be prefilled. Be sure to at least adjust stack name, availability zone, and tags

If you prefer, you could also conduct most of the steps to this point, aside from the user data steps, via the Lightsail GUI. Should you choose this option, the commands in the user data could be run via SSH. Essentially, we are creating the cheapest instance with the OS only.


Networking & Domain

After the template launch has completed, I configured the instance to attach a static IP and added the relevant networking configuration to open the ports for HTTP and HTTPS access as well as limit SSH access to my individual IP.

Creating a static IP - on the Lightsail instance, click the hamburger icon, Manage, then Networking
Only open ports for HTTP and HTTPS traffic, and be sure to restrict SSH to your IP

At this point, you need to open R53 and add the relevant A record with the aforementioned static IP.
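This can also be scripted; a minimal boto3 sketch for upserting the A record is below, where the hosted zone ID, record name, and IP are placeholders for your own values:

import boto3

route53 = boto3.client("route53")

# Point the blog subdomain at the Lightsail static IP (all values are examples)
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGHIJ",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "yoursub.domain.com",
                "Type": "A",
                "TTL": 300,
                "ResourceRecords": [{"Value": "203.0.113.10"}],
            },
        }]
    },
)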

Finally, I connected to the instance to run the docker compose and create the stack. The commands below first change directory to where the docker-compose yaml was created, then list the directory contents, and finally create the relevant containers.

cd /home/ubuntu/docker/blog
ls
docker-compose up -d

After all this, you should be able to go to yoursub.domain.com to see your new blog website, or yoursub.domain.com/ghost and begin configuring your blog.

Next steps

There are some things I haven’t discussed in this post in the interest of keeping it to the point, but I just want to call out some considerations after deployment.

  • Security - we’ve mostly considered SSL certification to enable HTTPS, but there are some other next steps such as configuring a web application firewall, SSH security (key only), DNS healthchecks, DNSSEC, and others.
  • I’ve set up some basic budget monitoring and alerts, but if you haven’t done so already these should definitely be considered.
  • Setting up Ghost itself (well described at their website).
  • Updating the docker container(s) manually, scheduled, or automatically using something like Portainer.
]]>