2026-01-06 22:59:58 +08:00

# Design Document
## Overview
This design replaces the `$unionWith` aggregation stage, which requires MongoDB 4.4 or later, with an approach that also works on MongoDB 3.x. The solution queries each collection separately, merges the results in application memory, deduplicates them with LINQ, and then applies sorting and pagination. This preserves functional equivalence on older MongoDB versions, at the cost of loading every matching document from every collection into memory before pagination.
## Architecture
### Current Architecture (MongoDB 4.4+)
```
┌───────────────────────────────────────────────────────┐
│  QueryPagedDeduplicatedResultsAcrossCollectionsAsync  │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
┌───────────────────────────────────────────────────────┐
│  Single Aggregation Pipeline with $unionWith          │
│  - $match (filter)                                    │
│  - $project (fields)                                  │
│  - $unionWith (merge collections)                     │
│  - $sort                                              │
│  - $group (deduplication)                             │
│  - $replaceRoot                                       │
│  - $sort (final)                                      │
│  - $facet (count + pagination)                        │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
                      MongoDB Server
```
### New Architecture (MongoDB 3.x Compatible)
```
┌───────────────────────────────────────────────────────┐
│  QueryPagedDeduplicatedResultsAcrossCollectionsAsync  │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
┌───────────────────────────────────────────────────────┐
│  Parallel Collection Queries                          │
│  For each collection:                                 │
│    - Find with filter                                 │
│    - Project fields                                   │
│    - Sort by sortField                                │
│    - ToListAsync                                      │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
┌───────────────────────────────────────────────────────┐
│  In-Memory Processing (LINQ)                          │
│  - Merge all results                                  │
│  - Sort by sortField                                  │
│  - GroupBy (Year, Month, Day, ItemId, Status)         │
│  - Select first from each group                       │
│  - Sort again (final order)                           │
│  - Count total                                        │
│  - Skip + Take (pagination)                           │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
                 Return (results, count)
```
## Components and Interfaces
### Modified Method: QueryPagedDeduplicatedResultsAcrossCollectionsAsync
**Signature:**
```csharp
private async Task<(List<SecondaryCircuitInspectionResult> Results, long TotalCount)>
    QueryPagedDeduplicatedResultsAcrossCollectionsAsync(
        List<string> collectionNames,
        FilterDefinition<SecondaryCircuitInspectionResult> filter,
        string sortField,
        bool isDescending,
        int skipCount,
        int pageSize,
        CancellationToken cancellationToken = default)
```
**Algorithm:**
1. **Input Validation**
   - Check if collectionNames is null or empty
   - Return empty results if no collections
2. **Prepare Query Components**
   - Render filter to BsonDocument
   - Build projection document
   - Create sort definition
3. **Query Each Collection in Parallel**
```csharp
// Build the sort once by field name. The driver cannot translate a reflection
// call such as GetPropertyValue into a server-side sort expression.
var sort = string.IsNullOrWhiteSpace(sortField)
    ? null
    : (isDescending
        ? Builders<SecondaryCircuitInspectionResult>.Sort.Descending(sortField)
        : Builders<SecondaryCircuitInspectionResult>.Sort.Ascending(sortField));

var tasks = collectionNames.Select(async collectionName => {
    var collection = GetCollection<SecondaryCircuitInspectionResult>(collectionName);
    var query = collection.Find(filter);
    if (sort != null) {
        query = query.Sort(sort);
    }
    return await query
        .Project<SecondaryCircuitInspectionResult>(projection)
        .ToListAsync(cancellationToken);
});
var collectionResults = await Task.WhenAll(tasks);
```
4. **Merge Results**
```csharp
var allResults = collectionResults.SelectMany(x => x).ToList();
```
5. **Sort Before Deduplication**
```csharp
if (!string.IsNullOrWhiteSpace(sortField)) {
    allResults = isDescending
        ? allResults.OrderByDescending(x => GetPropertyValue(x, sortField)).ToList()
        : allResults.OrderBy(x => GetPropertyValue(x, sortField)).ToList();
}
```
6. **Deduplication**
```csharp
var deduplicatedResults = allResults
    .GroupBy(x => new {
        x.Year,
        x.Month,
        x.Day,
        x.SecondaryCircuitInspectionItemId,
        Status = x.Status ?? string.Empty
    })
    .Select(g => g.First())
    .ToList();
```
7. **Final Sort**
```csharp
if (!string.IsNullOrWhiteSpace(sortField)) {
    deduplicatedResults = isDescending
        ? deduplicatedResults.OrderByDescending(x => GetPropertyValue(x, sortField)).ToList()
        : deduplicatedResults.OrderBy(x => GetPropertyValue(x, sortField)).ToList();
}
```
8. **Count and Paginate**
```csharp
var totalCount = deduplicatedResults.Count;
var paginatedResults = deduplicatedResults
    .Skip(skipCount)
    .Take(pageSize)
    .ToList();
return (paginatedResults, totalCount);
```
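The in-memory half of the steps above (merge, sort, deduplicate, paginate) can be sketched independently of MongoDB. This is a minimal illustration rather than the service's actual code: `Row` and `InMemoryPaging` are hypothetical names, `Row` stands in for `SecondaryCircuitInspectionResult`, and the sort key is fixed to `ExecutionTime` to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SecondaryCircuitInspectionResult (illustration only).
public sealed class Row
{
    public int Year { get; set; }
    public int Month { get; set; }
    public int Day { get; set; }
    public Guid ItemId { get; set; }
    public string Status { get; set; }
    public DateTime ExecutionTime { get; set; }
}

public static class InMemoryPaging
{
    // Merges per-collection results, keeps one row per deduplication key,
    // and returns the requested page plus the total deduplicated count.
    public static (List<Row> Results, long TotalCount) Page(
        IEnumerable<List<Row>> perCollection, bool isDescending, int skipCount, int pageSize)
    {
        var merged = perCollection.SelectMany(rows => rows);

        // OrderBy/OrderByDescending are stable, so ties keep their merge order.
        var sorted = isDescending
            ? merged.OrderByDescending(r => r.ExecutionTime)
            : merged.OrderBy(r => r.ExecutionTime);

        // GroupBy yields groups in order of first appearance, so taking
        // g.First() from each group keeps the overall list sorted.
        var deduplicated = sorted
            .GroupBy(r => new { r.Year, r.Month, r.Day, r.ItemId, Status = r.Status ?? string.Empty })
            .Select(g => g.First())
            .ToList();

        var page = deduplicated.Skip(skipCount).Take(pageSize).ToList();
        return (page, deduplicated.Count);
    }
}
```

In this sketch the second sort from step 7 is left out because the deduplicated list is a subsequence of the already sorted list. The design keeps the final sort to mirror the original pipeline, which is harmless.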
### Helper Method: GetPropertyValue
**Purpose:** Dynamically retrieve property values for sorting
**Signature:**
```csharp
private object GetPropertyValue(SecondaryCircuitInspectionResult obj, string propertyName)
```
**Implementation:**
```csharp
private object GetPropertyValue(SecondaryCircuitInspectionResult obj, string propertyName)
{
    var property = typeof(SecondaryCircuitInspectionResult).GetProperty(propertyName);
    if (property == null)
    {
        Log4Helper.Warning($"Property '{propertyName}' not found on SecondaryCircuitInspectionResult");
        return null;
    }
    return property.GetValue(obj);
}
```
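Because the helper above runs a reflection lookup for every element being sorted, one possible variant resolves the property once and hands back a reusable key selector. The sketch below is an assumption about how that could look, not part of the existing service: `SortKeyResolver` and `Sample` are hypothetical names, and logging is left to the caller.

```csharp
using System;
using System.Reflection;

public static class SortKeyResolver
{
    // Looks up the property once and returns a reusable key selector.
    // Returns null when the name is blank or the property does not exist,
    // so the caller can log a single warning and skip sorting.
    public static Func<T, object> TryCreate<T>(string propertyName)
    {
        if (string.IsNullOrWhiteSpace(propertyName)) return null;

        PropertyInfo property = typeof(T).GetProperty(
            propertyName, BindingFlags.Public | BindingFlags.Instance);
        if (property == null) return null;

        return item => property.GetValue(item);
    }
}

// Hypothetical entity used only to exercise the resolver.
public sealed class Sample
{
    public int Year { get; set; }
}
```

A caller would obtain the selector once, for example `var key = SortKeyResolver.TryCreate<Sample>("Year");`, and pass it to `OrderBy` only when it is not null.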
## Data Models
### SecondaryCircuitInspectionResult (Existing)
Key fields used in deduplication and sorting:
- `Year` (int): Year component of execution time
- `Month` (int): Month component of execution time
- `Day` (int): Day component of execution time
- `SecondaryCircuitInspectionItemId` (Guid): Inspection item identifier
- `Status` (string, nullable): Inspection status
- `ExecutionTime` (DateTime): When inspection was executed
- `Id` (string): MongoDB document ID
### Deduplication Key
Anonymous type used for grouping:
```csharp
new {
    Year,
    Month,
    Day,
    SecondaryCircuitInspectionItemId,
    Status = Status ?? string.Empty // Handle null
}
```
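Grouping on an anonymous type works because C# anonymous types compare by the values of all their properties. The short sketch below illustrates that behavior for the key shape above; `DedupKeyDemo`, `KeyOf`, and the shortened `ItemId` property name are hypothetical and exist only for this example.

```csharp
using System;

public static class DedupKeyDemo
{
    // Builds the grouping key. Null and empty Status values are normalized
    // to the same key, so they fall into the same group.
    public static object KeyOf(int year, int month, int day, Guid itemId, string status)
    {
        return new
        {
            Year = year,
            Month = month,
            Day = day,
            ItemId = itemId,
            Status = status ?? string.Empty
        };
    }
}
```

Two keys built from the same date and item compare equal when one status is null and the other is an empty string, and compare unequal when the statuses differ.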
### Query Result Structure
```csharp
(List<SecondaryCircuitInspectionResult> Results, long TotalCount)
```
- `Results`: Paginated list of deduplicated inspection results
- `TotalCount`: Total number of deduplicated results (before pagination)
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system—essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Deduplication Removes Duplicates by Key Fields
*For any* list of inspection results, when deduplication is applied, the output should contain at most one result for each unique combination of (Year, Month, Day, SecondaryCircuitInspectionItemId, Status).
**Validates: Requirements 1.4, 3.1**
### Property 2: Sorting Produces Correct Order
*For any* list of inspection results and any valid sort field, when sorting is applied in ascending order, each result should have a sort field value less than or equal to the next result's value. When sorting in descending order, each result should have a sort field value greater than or equal to the next result's value.
**Validates: Requirements 1.5**
### Property 3: First Record Selection After Sorting
*For any* group of inspection results with the same deduplication key (Year, Month, Day, SecondaryCircuitInspectionItemId, Status), when sorted by a field and deduplicated, the selected result should be the one with the minimum (for ascending) or maximum (for descending) value of the sort field within that group.
**Validates: Requirements 3.2**
### Property 4: Pagination Returns Correct Slice
*For any* deduplicated and sorted list of inspection results, when pagination is applied with skip count N and page size M, the returned results should be exactly the elements from index N to index N+M-1 (or end of list if shorter) from the sorted list.
**Validates: Requirements 3.3**
### Property 5: Total Count Matches Deduplicated Count
*For any* list of inspection results, the total count returned should equal the number of unique combinations of (Year, Month, Day, SecondaryCircuitInspectionItemId, Status) in the input, regardless of pagination parameters.
**Validates: Requirements 3.4**
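Properties 2 and 4 can be phrased as small executable predicates that a unit test or a property-based test could call. The following is a sketch under stated assumptions: `PropertyChecks` is a hypothetical helper, it works on already extracted sort keys rather than on the entity, and it assumes non-negative `skip` and `pageSize`.

```csharp
using System;
using System.Collections.Generic;

public static class PropertyChecks
{
    // Property 2: each key is <= its successor (ascending)
    // or >= its successor (descending).
    public static bool IsSorted<TKey>(IReadOnlyList<TKey> keys, bool isDescending)
        where TKey : IComparable<TKey>
    {
        for (int i = 0; i + 1 < keys.Count; i++)
        {
            int cmp = keys[i].CompareTo(keys[i + 1]);
            if (isDescending ? cmp < 0 : cmp > 0) return false;
        }
        return true;
    }

    // Property 4: the page equals elements [skip, skip + pageSize) of the
    // full list, truncated at the end of the list.
    public static bool IsCorrectSlice<T>(
        IReadOnlyList<T> full, IReadOnlyList<T> page, int skip, int pageSize)
    {
        int expected = Math.Max(0, Math.Min(pageSize, full.Count - skip));
        if (page.Count != expected) return false;
        for (int i = 0; i < expected; i++)
        {
            if (!EqualityComparer<T>.Default.Equals(full[skip + i], page[i])) return false;
        }
        return true;
    }
}
```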
## Error Handling
### Collection Query Failures
**Strategy:** Resilient querying with partial failure handling
- Each collection query is wrapped in try-catch
- Failed collection queries are logged with collection name and error details
- Successful collection results are still processed
- If all collections fail, return empty results with error logged
**Implementation:**
```csharp
var tasks = collectionNames.Select(async collectionName => {
    try {
        var collection = GetCollection<SecondaryCircuitInspectionResult>(collectionName);
        // ... query logic ...
        return await query.ToListAsync(cancellationToken);
    }
    catch (OperationCanceledException) {
        // Cancellation is not a collection failure; let it propagate to the caller.
        throw;
    }
    catch (Exception ex) {
        Log4Helper.Error($"Failed to query collection {collectionName}: {ex.Message}", ex);
        return new List<SecondaryCircuitInspectionResult>();
    }
});
```
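To exercise the partial-failure behavior without a database, the same pattern can be expressed over an injected query delegate. This is a hedged sketch, not the service's code: `ResilientFanOut`, `queryAsync`, and `logError` are hypothetical names introduced for illustration, and the delegate stands in for the per-collection MongoDB query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ResilientFanOut
{
    // Runs one query per name in parallel. A failing query is logged and
    // contributes an empty list; cancellation is rethrown so callers see it.
    public static async Task<List<T>> QueryAllAsync<T>(
        IEnumerable<string> names,
        Func<string, CancellationToken, Task<List<T>>> queryAsync,
        Action<string, Exception> logError,
        CancellationToken cancellationToken)
    {
        var tasks = names.Select(async name =>
        {
            try
            {
                return await queryAsync(name, cancellationToken);
            }
            catch (OperationCanceledException)
            {
                throw;
            }
            catch (Exception ex)
            {
                logError(name, ex);
                return new List<T>();
            }
        });

        var perCollection = await Task.WhenAll(tasks);
        return perCollection.SelectMany(rows => rows).ToList();
    }
}
```

A unit test can then pass a delegate that throws for one collection name and returns data for the others, and assert that the surviving results are still merged.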
### Null Status Handling
**Strategy:** Treat null as empty string for grouping
- Null Status values are converted to empty string in the grouping key
- This ensures consistent deduplication behavior
- Matches the original MongoDB implementation's `$ifNull` behavior
**Implementation:**
```csharp
.GroupBy(x => new {
    x.Year,
    x.Month,
    x.Day,
    x.SecondaryCircuitInspectionItemId,
    Status = x.Status ?? string.Empty // Null becomes empty string
})
```
### Invalid Sort Field
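**Strategy:** Log warning and continue without sorting
- If the sort field doesn't exist on the entity, the GetPropertyValue helper logs a warning and returns null
- With every sort key null, LINQ OrderBy (a stable sort) leaves the merged order unchanged, so the query continues effectively unsorted
- As written, the helper is called once per record, so the warning is logged once per record; validating the sort field once before sorting avoids flooding the log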
### Cancellation Support
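**Strategy:** Pass CancellationToken through all async operations
- CancellationToken is passed to all MongoDB queries
- Task.WhenAll takes no token itself; it completes once every query task has finished, so cancellation takes effect when the individual queries observe the token
- If cancelled, OperationCanceledException is thrown and propagated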
### Empty Collection List
**Strategy:** Early return with empty results
```csharp
if (collectionNames == null || collectionNames.Count == 0)
{
    return (new List<SecondaryCircuitInspectionResult>(), 0);
}
```
## Testing Strategy
### Dual Testing Approach
This feature requires both unit tests and property-based tests:
- **Unit tests**: Verify specific examples, edge cases, and error conditions
- **Property tests**: Verify universal properties across all inputs
- Both are complementary and necessary for comprehensive coverage
### Property-Based Testing
**Framework:** Use **FsCheck** for C# property-based testing (or **CsCheck** as an alternative)
**Configuration:**
- Minimum 100 iterations per property test
- Each test must reference its design document property
- Tag format: **Feature: mongodb-compatibility-fix, Property {number}: {property_text}**
**Test Structure:**
Each correctness property will be implemented as a single property-based test:
1. **Property 1 Test**: Generate random lists of inspection results with intentional duplicates, verify deduplication
2. **Property 2 Test**: Generate random lists and sort fields, verify sort order
3. **Property 3 Test**: Generate groups with multiple records, verify first selection
4. **Property 4 Test**: Generate random skip/limit values, verify correct slice
5. **Property 5 Test**: Generate random lists, verify count accuracy
**Generators:**
Custom generators needed:
- `InspectionResultGenerator`: Generates random SecondaryCircuitInspectionResult objects
- `DuplicateGroupGenerator`: Generates groups of results with same deduplication key
- `SortFieldGenerator`: Generates valid property names for sorting
- `PaginationParamsGenerator`: Generates valid skip/limit combinations
**Edge Cases to Include:**
- Null Status values
- Empty collection lists
- Single collection
- Large skip values (beyond result count)
- Zero page size
- Results with identical sort field values
### Unit Testing
**Focus Areas:**
1. **Specific Examples**
- Query with 3 collections, verify merge
- Deduplication with known duplicates
- Pagination edge cases (first page, last page, beyond end)
2. **Error Conditions**
- All collections fail to query
- Invalid sort field name
- Cancellation during query
- Null or empty collection list
3. **Integration Points**
- MongoDB query execution
- Filter rendering
- Projection building
**Test Organization:**
- Create `SecondaryCircuitInspectionResultAppServiceTests.cs` in test project
- Group tests by functionality (deduplication, sorting, pagination, error handling)
- Use descriptive test names: `QueryPagedDeduplicatedResults_WithDuplicates_RemovesDuplicates`
### Performance Testing
While not part of correctness properties, performance should be monitored:
- Measure query time for 1, 6, and 12 collections
- Verify performance warnings are logged when exceeding thresholds
- Compare performance with original $unionWith implementation (on MongoDB 4.4+)
**Performance Expectations:**
- 1-3 collections: < 500ms
- 4-6 collections: < 1000ms
- 7-12 collections: < 2000ms
- Warning threshold: 2000ms