In recent years, an increasing number of customers have requested options to extend retention in Microsoft Defender XDR beyond the default 30 days at a low cost, while keeping the KQL query experience available.

Blog information: This feature is currently Generally Available

Blog update: 27 October 2025

By default, Defender XDR retains incidents, alerts, and related data for 180 days, while Advanced Hunting data is limited to just 30 days. For proactive threat hunting and compliance, these retention periods are often too short – making extended storage essential.

Customers often need to keep data accessible for longer than three months. In some cases this is purely a regulatory requirement; in other cases they need to run hunting investigations on older data.

Previous solutions

Previously, there were a couple of options based on the Defender XDR streaming API:

  • Azure Data Explorer
  • Stream data directly to Log Analytics
  • Stream data directly to a Storage account/ event hub

Azure Data Explorer

Over the past years, I deployed many Azure Data Explorer clusters for smaller companies and larger enterprises, with up to 6TB ingested daily. It used to be quite hard to find a solution where the KQL language is available at a low cost. Azure Data Explorer is a great solution that avoids the expensive Analytics-tier ingestion cost of Microsoft Sentinel.

The downside of Azure Data Explorer is the maintenance and complexity of setting up a cluster. In short, it requires:

  • An Event Hub or Storage account
  • Maintaining Throughput Units to handle the events per second and avoid data loss
  • An Azure Data Explorer cluster
  • Azure Data Explorer sizing
  • Maintaining and tracking performance/ingestion
  • Parsing and transforming the incoming data
  • Maintaining the tables
  • Active reporting to track performance peaks
  • Maintenance for new tables and data schemas
  • And more

So keep in mind that running Event Hub namespaces and Azure Data Explorer clusters requires additional troubleshooting/operational tasks and active reporting. When the people and knowledge are available, it is a great solution at relatively low cost – it can still be cheaper than the new Sentinel data lake – but it requires operations and ongoing maintenance to keep it running. It is far from managed.

Log Analytics/ Microsoft Sentinel

Streaming logs directly into Log Analytics/Microsoft Sentinel is the easiest way – and also the most expensive way – of ingesting and storing data for archiving purposes. You’ll be billed for ingestion into Sentinel before these logs can be stored, and ingestion in Microsoft Sentinel is not cheap. The benefit is that Microsoft automatically creates the mappings and performs all the data transformations as part of the ingestion.

Downside: 1TB a day in the Analytics tier costs around $3,207 a day for ingestion alone, without any retention. That is over $95,000 for 30TB of ingestion each month.

These numbers are based on pay-as-you-go ingestion, without any discount or commitment tier. Even with discounts or commitment tiers applied, it is still expensive for just storing data longer.
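To make the scale concrete, here is a minimal back-of-the-envelope sketch. The per-GB rate is derived from the pay-as-you-go figure quoted above and is an assumption, not official pricing:

```python
# Back-of-the-envelope Analytics-tier ingestion cost.
# PRICE_PER_GB is derived from the ~$3,207/TB/day figure quoted above
# (pay-as-you-go, no discounts) and is an assumption, not a price quote.
PRICE_PER_GB = 3.207

def monthly_ingestion_cost(gb_per_day: float, days: int = 30) -> float:
    """Ingestion-only cost in USD; retention and query costs come on top."""
    return gb_per_day * PRICE_PER_GB * days

print(round(monthly_ingestion_cost(1000)))  # 1 TB/day -> 96210
```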

Why is it a challenge?

Advanced Hunting data is part of the Defender XDR license and is already paid for. All hunting queries and detections can be created directly on top of the Defender XDR dataset – and with the migration from Sentinel to the unified SecOps platform, it is clear that Microsoft’s direction is to put all analytics and detections in Defender XDR. Yet the same data is mirrored to Sentinel, where you pay the expensive Sentinel ingestion cost. That can be done more efficiently without ingesting the data into Sentinel at all.

What do we need?

It is simple: use the Defender XDR dataset for running queries/analytics and all the analytics functionality, and store the data cheaply for longer retention without first ingesting it into Microsoft Sentinel:

What we want: Logs directly from Defender XDR to the Sentinel Datalake without expensive ingestion in the Analytics tier.

What we don’t want: paying the expensive Analytics-tier ingestion cost for each GB in Sentinel. Of course, when many detections and use cases live in Sentinel, this is a different story.

Solution: Logs via a Workspace Transformation DCR directly into Sentinel Datalake custom tables.

Let’s go and see what is possible with Microsoft Sentinel Data Lake


The new solution: Microsoft Sentinel data lake

Microsoft Sentinel data lake is the new solution: a cost-efficient way to store data for years as part of the data lake. Sentinel data lake was announced on July 22nd, 2025, marking a new chapter for log management in the Microsoft security ecosystem.

Microsoft says the following:

“Sentinel data lake simplifies security data management, eliminates security data silos, and enables cost-effective long-term security data retention with the ability to run multiple forms of analytics on a single copy of that data”.

In short, the Sentinel data lake is a fully managed data lake: Microsoft maintains and scales the backend and makes sure performance stays scalable and efficient for the amount of data.

Read my in-depth blog with all the Sentinel Datalake information and features: Microsoft Sentinel Data Lake: How to use/enable and set-up the unified datalake

How does the ingestion work?

With the release of the new data lake, Microsoft released a new feature named Table Settings, where it is possible to manage the ingestion pipeline. In the Table Settings, we can configure the ingestion tier (analytics or data lake).

Now the trick – it depends on which table you’re configuring: not all tables can be configured to ingest directly into the data lake tier. The following tiers are available:

Analytics tier: Data is ingested into Log Analytics, mirrored to the data lake, and stored in the data lake for longer retention.

Data lake tier: Data is ingested directly into the data lake and never reaches Log Analytics; pricing is cheaper.

When switching to the data lake tier only, the table is changed from the Analytics tier to the data lake tier. But keep in mind – the table will have fewer analytics functions and is a bit slower, although still queryable via KQL.

XDR data

This blog is focused on storing Defender XDR data more cheaply. When you go to Table Settings in Defender XDR, you will see there is no option to store the data directly in the data lake. The only option is the following:

Analytics tier only

When configured as in the above screenshot, it will not create any table/events in the data lake (unless the retention or long-term retention is set above 30 days).

When setting the following:

With 30-day analytics (as part of Defender XDR) and 1-year total retention, it will perform the following:

  • In the Sentinel Defender XDR connector, the DeviceProcessEvents table will be streamed to LAW/Sentinel
  • Data will go to Analytics first (it lands in Sentinel Log Analytics and becomes part of the Analytics tier before being sent to the data lake)

Conclusion: Additional cost for the ingestion, still based on Microsoft Sentinel ingestion as part of the connector. Not ideal, since the data is ingested into the expensive Analytics tier first. The “included in license” benefit only covers the Defender XDR Advanced Hunting data.

Skipping Sentinel analytics ingestions

Ideally, since the Defender XDR data is part of the Advanced Hunting dataset, we want the data there for 30 days and then streamed to the data lake without the Sentinel ingestion step. The downside: we cannot switch the table directly to the data lake tier only. But there is a solution:

What is possible is creating a copy of the DeviceProcessEvents and naming it DeviceProcessEventsDL_CL for example.

Naming convention: DL = Datalake; CL = Custom log

The E5 benefit of 5MB/user/day of included Sentinel ingestion can be used for some tables; switch only the big tables to custom tables to leverage the discount, since replication from Analytics to the data lake is “free”.
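As a rough sizing sketch for that grant (the 5MB/user/day figure is the benefit mentioned above; the user count in the example is hypothetical):

```python
# E5 Sentinel benefit sketch: the grant is 5 MB per eligible user per day.
# The user count below is a hypothetical example, not a recommendation.
MB_PER_USER_PER_DAY = 5

def included_gb_per_day(users: int) -> float:
    """Analytics-tier ingestion covered by the E5 grant, in GB/day."""
    return users * MB_PER_USER_PER_DAY / 1024  # MB -> GB

print(round(included_gb_per_day(10_000), 1))  # 10,000 E5 users -> 48.8
```

Tables that fit inside that included volume can stay in the Analytics tier for free; everything above it is where the custom data lake tables pay off.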

Table creation tool

Table creation can be hard, since the table mapping needs to be available/mirrored from the original table structure – or easy with the tableCreator.ps1 script created by Marko Lauren. (GitHub link to tableCreator.ps1)

Important: The data lake tier does not support the dynamic type out of the box. When no parameter is given, the script will skip dynamic-type columns, which are not supported by data lake tables. As an alternative, -ConvertToString can be used to convert the dynamic type to a string type.

When creating the table first in Analytics and moving it via Table Settings in Defender XDR, dynamic is supported. The trick is: create the table first in Analytics and move it in the portal from Analytics to data lake.

Microsoft Docs: Tables with the Auxiliary plan don’t support columns with dynamic data. Source

Run the script via the Azure CLI. Before running, update the resourceID details in the script with the resourceID of the Log Analytics workspace. Since the data lake is built on top of the Auxiliary tier, this can be used for creating custom tables.

So in short: the Auxiliary and data lake tiers are the same.

Enter the new table name, table type, and total retention period, as shown below. Create the table first with the Analytics table type, not directly as Auxiliary. When creating the table directly in Auxiliary, the dynamic fields are not created in the schema.
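Under the hood, such a script creates the table with a PUT against the workspace Tables API. A hedged sketch of the request body – the column list is an abbreviated, assumed subset of the DeviceProcessEvents schema, and creating it with the Analytics plan first keeps the dynamic columns:

```json
{
  "properties": {
    "plan": "Analytics",
    "totalRetentionInDays": 365,
    "schema": {
      "name": "DeviceProcessEventsDL_CL",
      "columns": [
        { "name": "TimeGenerated", "type": "datetime" },
        { "name": "DeviceName", "type": "string" },
        { "name": "InitiatingProcessFileName", "type": "string" },
        { "name": "AdditionalFields", "type": "dynamic" }
      ]
    }
  }
}
```

After creation, the tier is switched from Analytics to data lake in the Defender portal, as described below.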

Now go to table management in Defender XDR and change the custom-created table from Analytics to data lake only.

Change the tier from Analytics to data lake for the custom-created table.

As a result, the table (DeviceProcessEventsDL_CL) is created with the data lake type, with the dynamic columns copied from the original table.

Data Collection rule (transform data)

Now that the data lake table has been created, the next step is to implement transformation logic at the data collection rule level.

Since we already created a custom table, we need transformation logic to move all data from the original DeviceProcessEvents table to the DeviceProcessEventsDL_CL table. Since there is no AMA agent involved, this works a bit differently and is done via data collection rules.

Workspace transformation DCR

The workspace transformation data collection rule (DCR) is a special DCR that’s applied directly to a Log Analytics workspace. Since the Defender logs are not routed via the AMA agent, we need to create a workspace transformation DCR for the workspace.

In short: all tables that don’t use a DCR for data ingestion can be managed via workspace transformation data collection rules.

There can only be one workspace transformation DCR for each workspace, but it can include transformations for multiple tables. Via the Azure portal, we can easily create a workspace transformation rule:

More information: Create a transformation in Azure Monitor – Azure Monitor | Microsoft Learn

To create the transformation rule, go to the Log Analytics workspace and search for a default table, such as DeviceProcessEvents. Click on Edit transformation.

Create the data collection rule; there can only be one transformation rule for the workspace, so give it a general name rather than one focused on a specific table.

Now we need to change the content of the default transformation rule – this cannot be done via UI and needs to be performed via code. The data collection rule must be created with the Kind: WorkspaceTransforms. This can be validated via the Data Collection rules overview.
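For orientation, the top of the exported template looks roughly like this (a hedged sketch – the name, location, and destination name are placeholders; the key part is kind: WorkspaceTransforms):

```json
{
  "type": "Microsoft.Insights/dataCollectionRules",
  "apiVersion": "2023-03-11",
  "name": "dcr-workspace-transformations",
  "location": "westeurope",
  "kind": "WorkspaceTransforms",
  "properties": {
    "destinations": {
      "logAnalytics": [
        {
          "workspaceResourceId": "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.OperationalInsights/workspaces/<workspace>",
          "name": "la-destination"
        }
      ]
    },
    "dataFlows": []
  }
}
```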

Open the DCR, click on Export template > Deploy > Edit Template as shown below:

Click on Edit template to change the template

In the dataFlows section, we need to change the transformations for each stream. As already mentioned, we can add all the transformations in one single file.

In the example below, the stream Microsoft-Table-DeviceProcessEvents is routed to the destination (LAW) – with the output being the custom table Custom-DeviceProcessEventsDL_CL.

For custom tables, the naming convention requires the output stream to start with Custom- in the template.

Note: The destination ID is different for each tenant.

"dataFlows": [
                    {
                        "streams": [
                            "Microsoft-Table-DeviceProcessEvents"
                        ],
                        "destinations": [
                            "0281239d889a430ba48eafac28f5284b"
                        ],
                        "outputStream": "Custom-DeviceProcessEventsDL_CL"
                    }
                ]

Example of a dataflow for multiple Defender XDR streams:

"dataFlows": [
                    {
                        "streams": [
                            "Microsoft-Table-DeviceProcessEvents"
                        ],
                        "destinations": [
                            "02331269d889a430bfdfeafac28fb284b"
                        ],
                        "outputStream": "Custom-DeviceProcessEventsDL_CL"
                    },
                    {
                        "streams": [
                            "Microsoft-Table-DeviceNetworkEvents"
                        ],
                        "destinations": [
                            "02331269d889a430bfdfeafac28fb284b"
                        ],
                        "outputStream": "Custom-DeviceNetworkEventsDL_CL"
                    },
                    {
                        "streams": [
                            "Microsoft-Table-CloudAppEvents"
                        ],
                        "destinations": [
                            "02331269d889a430bfdfeafac28fb284b"
                        ],
                        "outputStream": "Custom-CloudAppEventsDL_CL"
                    },
                    {
                        "streams": [
                            "Microsoft-Table-DeviceInfo"
                        ],
                        "destinations": [
                            "02331269d889a430bfdfeafac28fb284b"
                        ],
                        "outputStream": "Custom-DeviceInfoDL_CL"
                    }
                ]
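Each dataFlow entry can also carry a transformKql property. The value "source" forwards events unchanged, while a KQL expression can filter or trim rows before they land in the custom table, shrinking the data lake footprint further. A hedged sketch of a single entry (the filter itself is a hypothetical example):

```json
{
    "streams": [
        "Microsoft-Table-DeviceProcessEvents"
    ],
    "destinations": [
        "02331269d889a430bfdfeafac28fb284b"
    ],
    "transformKql": "source | where isnotempty(FileName)",
    "outputStream": "Custom-DeviceProcessEventsDL_CL"
}
```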

Deploy the changed DCR configuration. Once deployed, the data is transformed directly into the custom tables: all data is routed from the original table to the custom table, which sits in the cheaper data lake tier. All hunting activities and custom detections can still be created on top of the XDR Advanced Hunting data, which is already “included” in the license, avoiding double ingestion costs. Ensure that all detection rules are migrated from Log Analytics to Custom Detections in Defender XDR before stopping ingestion in Microsoft Sentinel.

This stream still requires the Defender XDR connector in Microsoft Sentinel to be enabled for sending logs. When the event categories are enabled, data will stream via the transformation rule into the custom-created tables.

Result

The original DeviceProcessEvents table is empty:

The custom table is filled with data via the transformation rule, and ingestion is billed at the data lake tier instead of the Analytics tier.
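The custom table remains queryable via KQL in the lake tier, for example to confirm events are arriving (a minimal sketch against the table created above):

```kql
DeviceProcessEventsDL_CL
| summarize Events = count() by bin(TimeGenerated, 1d)
| order by TimeGenerated desc
```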


Source

Marko (LinkedIn): Sentinel Data Lake – what does it mean for your ingestion, transformations & retention?

Jeffrey Appel: Microsoft Sentinel Data Lake: How to use/enable and set-up the unified datalake

Microsoft: Microsoft Sentinel data lake overview (preview) – Microsoft Security | Microsoft Learn