Dataverse: The Importance of OrderBy When Retrieve more than 5k records
Lately, I've been playing with retrieving large sets of data from Dataverse. The total records that I need are more than 50k records ( if you are using FetchXML, you can get the method here). But somehow seems like Dataverse limits my total records and here is how we fixed it.
As usual, I generate the FetchXML using our beloved SQL4CDS:

Here is the code that I'm using to retrieve the data:
using DataverseClient;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.PowerPlatform.Dataverse.Client;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Messages;
using Newtonsoft.Json;
using Microsoft.Xrm.Sdk.Query;
using System.Text;
using System.Xml;
var builder = Helper.CreateHostBuilder().Build();
var serviceProvider = builder.Services;
var service = serviceProvider.GetRequiredService<ServiceClient>();
var fetchCount = 5000;
var pageNumber = 1;
string pagingCookie = null;
var fetchXml = @"<fetch xmlns:generator='MarkMpn.SQL4CDS'>
<entity name='contact'>
<attribute name='firstname' />
<attribute name='lastname' />
<attribute name='contactid' />
<link-entity name='tmy_header' to='contactid' from='tmy_contact' alias='tmy_header' link-type='outer'>
<attribute name='tmy_name' />
<attribute name='tmy_headerid' />
<link-entity name='tmy_detail' to='tmy_headerid' from='tmy_header' alias='tmy_detail' link-type='outer'>
<attribute name='tmy_qty' />
<attribute name='tmy_price' />
<attribute name='tmy_detailid' />
<order attribute='tmy_detailid' />
</link-entity>
<order attribute='tmy_headerid' />
</link-entity>
<order attribute='contactid' />
</entity>
</fetch>";
var totalRecords = 0;
do
{
var xml = CreateXml(fetchXml, pagingCookie, pageNumber, fetchCount);
var result = service.RetrieveMultiple(new FetchExpression(xml));
totalRecords += result.Entities.Count;
if (result.MoreRecords)
{
pageNumber++;
pagingCookie = result.PagingCookie;
Console.WriteLine($"Paging cookie: {pagingCookie}. Page Number: {pageNumber}. Total Retrieved: {totalRecords}.");
}
else
{
Console.WriteLine("Retrieved: " + totalRecords);
break;
}
} while (true);
string CreateXml(string xml, string cookie, int page, int count)
{
StringReader stringReader = new StringReader(xml);
var reader = new XmlTextReader(stringReader);
// Load document
XmlDocument doc = new XmlDocument();
doc.Load(reader);
XmlAttributeCollection attrs = doc.DocumentElement.Attributes;
if (cookie != null)
{
XmlAttribute pagingAttr = doc.CreateAttribute("paging-cookie");
pagingAttr.Value = cookie;
attrs.Append(pagingAttr);
}
XmlAttribute pageAttr = doc.CreateAttribute("page");
pageAttr.Value = System.Convert.ToString(page);
attrs.Append(pageAttr);
XmlAttribute countAttr = doc.CreateAttribute("count");
countAttr.Value = System.Convert.ToString(count);
attrs.Append(countAttr);
StringBuilder sb = new StringBuilder(1024);
StringWriter stringWriter = new StringWriter(sb);
XmlTextWriter writer = new XmlTextWriter(stringWriter);
doc.WriteTo(writer);
writer.Close();
return sb.ToString();
}
Console.WriteLine("Total Records " + totalRecords);
Here is the screenshot when I retrieve using the above code:

As you can see in the above screenshot, we can only retrieve 50k records (while the data is more than 100k), and each paging cookie is empty. From documentation that you can read here, it stated that pagination and ordering are essential. Hence, I fix the FetchXml (removed the unnecessary order by) to below:
var fetchXml = @"<fetch xmlns:generator='MarkMpn.SQL4CDS'>
<entity name='contact'>
<attribute name='firstname' />
<attribute name='lastname' />
<attribute name='contactid' />
<link-entity name='tmy_header' to='contactid' from='tmy_contact' alias='tmy_header' link-type='outer'>
<attribute name='tmy_name' />
<attribute name='tmy_headerid' />
<link-entity name='tmy_detail' to='tmy_headerid' from='tmy_header' alias='tmy_detail' link-type='outer'>
<attribute name='tmy_qty' />
<attribute name='tmy_price' />
<attribute name='tmy_detailid' />
</link-entity>
</link-entity>
<order attribute='contactid' />
</entity>
</fetch>";
Once we fixed that, I can retrieve more than 50k already:

Happy CRM-ing! 😊
Leave a comment
Your comment is sent privately to the author and isn't published on the site.