CC#
Created by Rory& on 4/7/2025 in #help
Optimising read-and-process loop
though i probably should reorder the data to be chronological...
something seems a bit odd here ngl lol
honestly im pretty happy with that result
both depend on GetSerializedUnoptimisedKeysParallel as defined above
testing without filesystem cache:
- chunked+threading approach: 13m27s
- recursive with async (div2): 2m20s
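For comparison, the chunked+threading variant being timed here presumably has roughly the shape sketched below; only the chunk computation appears verbatim in the thread, and ProcessChunkAsync and MergeResults are hypothetical placeholders for the real per-chunk processing and final merge.

// Hedged sketch of the "chunked+threading" variant: one task per chunk of consecutive
// keys, each chunk processed sequentially, partial results merged at the end.
// ProcessChunkAsync and MergeResults are hypothetical placeholders.
async Task<SyncResponse> ProcessChunkedAsync(List<string> serialisedKeys) {
    var chunkSize = serialisedKeys.Count / Environment.ProcessorCount;
    var chunks = serialisedKeys.Chunk(chunkSize + 1).ToList();
    var partials = await Task.WhenAll(chunks.Select(chunk => Task.Run(() => ProcessChunkAsync(chunk))));
    return MergeResults(partials);
}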
implemented the merging recursively, its quite fast now but allocates a bunch of ReadStackFrames
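Roughly, the recursive divide-by-two structure might look like the sketch below; the thread doesn't include the actual merge code, so LoadAndProcessAsync and Merge are hypothetical stand-ins.

// Hedged sketch of the "recursive with async (div2)" idea: split the ordered key list in
// half, process both halves concurrently, then merge the two partial results.
// LoadAndProcessAsync and Merge are hypothetical placeholders for the real logic.
async Task<SyncResponse?> ProcessRangeAsync(List<string> keys, int offset, int count) {
    if (count == 0) return null;
    if (count == 1) return await LoadAndProcessAsync(keys[offset]);

    var half = count / 2;
    var leftTask = ProcessRangeAsync(keys, offset, half);
    var rightTask = ProcessRangeAsync(keys, offset + half, count - half);
    await Task.WhenAll(leftTask, rightTask);

    // Apply the newer (right) half on top of the older (left) half.
    return Merge(await leftTask, await rightTask);
}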
hm, with the merge logic, that gives me 12m42s, and that doesnt even do the final merge
but that should be solved by using a decent scheduler at the OS kernel level
i'd assume it'd start thrashing on HDDs though...
so what, 6x faster?
37s without filesystem cache, if i replace the ConcurrentDictionary with a regular Dictionary + lock
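For illustration, the swap described here might look roughly like this inside GetSerializedUnoptimisedKeysParallel; this is a sketch of the assumed shape, not the author's exact change.

// Sketch of the ConcurrentDictionary -> Dictionary + lock swap described above (assumed
// shape, not the author's exact diff). Each key is only added once, so a plain Dictionary
// guarded by a single lock is sufficient here.
Dictionary<string, string> pairs = new();
var pairsLock = new object();
await Parallel.ForEachAsync(unoptimisedKeys, async (key, _) => {
    var data = await storageProvider.LoadObjectAsync<SyncResponse>(key, SyncResponseSerializerContext.Default.SyncResponse);
    if (data is null) return;
    lock (pairsLock) {
        pairs[key] = data.NextBatch;
    }
});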
so if im seeing it right, that's > 2x as fast
// Requires: using System.Collections.Concurrent; using System.Collections.Frozen;
private async Task<List<string>> GetSerializedUnoptimisedKeysParallel(string start = "init") {
    ConcurrentDictionary<string, string> pairs = [];
    // Only root-level keys hold unoptimised sync responses; keys containing '/' are metadata files in subdirs.
    var unoptimisedKeys = (await storageProvider.GetAllKeysAsync()).Where(static x => !x.Contains('/')).ToFrozenSet();
    // Read every response in parallel and record its key -> next_batch link.
    await Parallel.ForEachAsync(unoptimisedKeys, async (key, _) => {
        try {
            var data = await storageProvider.LoadObjectAsync<SyncResponse>(key, SyncResponseSerializerContext.Default.SyncResponse);
            if (data is null) return;
            pairs.TryAdd(key, data.NextBatch);
        }
        catch (Exception e) {
            Console.WriteLine($"Failed to read {key}: {e}");
            throw;
        }
    });

    // Walk the key -> next_batch chain from `start` to put the keys in order.
    var serializedKeys = new List<string>();
    var currentKey = start;
    while (pairs.TryGetValue(currentKey, out var nextKey)) {
        serializedKeys.Add(currentKey);
        currentKey = nextKey;
    }

    return serializedKeys;
}

// Usage:
List<string> serialisedKeys = await GetSerializedUnoptimisedKeysParallel();

// chunkSize + 1 rounds up so the list splits into at most Environment.ProcessorCount chunks.
var chunkSize = serialisedKeys.Count / Environment.ProcessorCount;
var chunks = serialisedKeys.Chunk(chunkSize + 1).Select(x => (First: x.First(), Length: x.Length)).ToList();
Console.WriteLine($"Got {chunks.Count} chunks:");
foreach (var chunk in chunks) {
    Console.WriteLine($"Chunk {chunk.First} with length {chunk.Length}");
}
1m16s without filesystem cache
lets try without FS cache
result: 32s
it was reading metadata files i was writing into subdirs because i was getting root level keys wrong