2026-01-06 22:59:58 +08:00
# Implementation Plan: Inspection Result Daily Deduplication Fix
## Overview
This implementation plan addresses the bug where daily deduplication of inspection results occurs after pagination instead of before. The fix involves refactoring the MongoDB aggregation pipeline in `SecondaryCircuitInspectionResultAppService.FindDatas` to perform deduplication before pagination.
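The effect of the ordering can be seen in a small in-memory model. This is only a sketch in Python, not the production code: the sample records are hypothetical, the field names are borrowed from the grouping key in this plan, and "newest first" is an assumed sort order.

```python
# In-memory model of the two orderings. Records are assumed to be sorted
# newest first; field names mirror the grouping key used in this plan.
KEY_FIELDS = ("Year", "Month", "Day", "ItemId", "Status")


def dedupe_daily(records):
    """Keep the first record seen for each (Year, Month, Day, ItemId, Status)."""
    seen = set()
    kept = []
    for record in records:
        key = tuple(record[field] for field in KEY_FIELDS)
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


def paginate(records, skip, limit):
    """Plain skip/limit paging over an already ordered list."""
    return records[skip:skip + limit]


def make_record(item_id, execution_time):
    # Hypothetical sample data: every record is on the same day with the same status.
    return {"Year": 2026, "Month": 1, "Day": 6, "ItemId": item_id,
            "Status": "OK", "ExecutionTime": execution_time}


# Item A was inspected three times on one day, item B once.
records = [
    make_record("A", "2026-01-06T12:00"),
    make_record("A", "2026-01-06T11:00"),
    make_record("A", "2026-01-06T10:00"),
    make_record("B", "2026-01-06T09:00"),
]

page_size = 2
buggy_page = dedupe_daily(paginate(records, 0, page_size))  # paginate, then dedupe
fixed_page = paginate(dedupe_daily(records), 0, page_size)  # dedupe, then paginate

print(len(buggy_page))  # 1: the page held two copies of item A
print(len(fixed_page))  # 2: one record for A and one for B
```

With deduplication applied last, the first page shrinks to a single record even though two distinct daily results exist; applying it first fills the page as requested.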
## Tasks
- [x] 1. Create new aggregation pipeline method with deduplication
  - Create `QueryPagedDeduplicatedResultsAcrossCollectionsAsync` method
  - Implement MongoDB aggregation pipeline with `$group` stage for deduplication
  - Use `$facet` to get both count and paginated results in a single query
  - Return tuple of `(Results, TotalCount)`
  - _Requirements: 1.1, 1.2, 1.3, 3.1, 3.2, 3.3, 4.1, 4.2, 4.3_
  - [ ]* 1.1 Write property test for deduplication completeness
    - **Property 1: Deduplication Completeness**
    - **Validates: Requirements 1.2**
- [x] 2. Refactor `FindDatas` method to use new pipeline
  - Replace `QueryPagedResultsAcrossCollectionsAsync` call with new method
  - Remove in-memory deduplication logic (lines 712-720)
  - Use `TotalCount` from pipeline result instead of a separate count query
  - Update logging to reflect new approach
  - _Requirements: 1.1, 1.4, 2.1, 2.2, 2.3_
  - [ ]* 2.1 Write property test for pagination consistency
    - **Property 2: Pagination Consistency**
    - **Validates: Requirements 2.2**
  - [ ]* 2.2 Write property test for count accuracy
    - **Property 4: Count Accuracy**
    - **Validates: Requirements 2.1**
- [x] 3. Implement grouping key construction
  - Build `$group` stage with composite key (`Year`, `Month`, `Day`, `ItemId`, `Status`)
  - Use `$first` operator to preserve the first record after sorting
  - Handle null `Status` values in grouping
  - _Requirements: 1.2, 1.3_
  - [ ]* 3.1 Write unit test for null status handling
    - Test that null `Status` values are handled correctly in grouping
    - _Requirements: 1.2_
- [x] 4. Implement `$facet` stage for count and pagination
  - Create facet with two pipelines: `totalCount` and `data`
  - Extract count from `totalCount` array
  - Extract paginated results from `data` array
  - Handle empty result sets correctly
  - _Requirements: 2.1, 2.2, 3.4_
  - [ ]* 4.1 Write unit test for empty result sets
    - Test that empty results return `TotalCount=0` and an empty list
    - _Requirements: 2.1_
- [x] 5. Update cross-collection query logic
  - Ensure `$unionWith` stages are added before `$group`
  - Verify consistent field projection across all collections
  - Test deduplication across collection boundaries
  - _Requirements: 4.1, 4.2, 4.3_
  - [ ]* 5.1 Write property test for cross-collection consistency
    - **Property 5: Cross-Collection Consistency**
    - **Validates: Requirements 4.2**
- [ ] 6. Checkpoint - Ensure all tests pass
  - Run all unit tests and property tests
  - Verify query performance is acceptable
  - Check that API response format is unchanged
  - Ask the user if questions arise
- [x] 7. Add performance monitoring
  - Add logging for deduplication operation timing
  - Log before/after record counts
  - Add warning logs if query time exceeds thresholds
  - _Requirements: 5.2, 5.3, 5.4_
  - [ ]* 7.1 Write unit test for performance logging
    - Verify that performance metrics are logged correctly
    - _Requirements: 5.3_
- [x] 8. Create database index recommendation
  - Document compound index on (`Year`, `Month`, `Day`, `ItemId`, `Status`, `ExecutionTime`)
  - Add index creation script to migration folder
  - Document index purpose and benefits
  - _Requirements: 5.1_
- [ ] 9. Update method documentation
  - Update XML comments for `FindDatas` method
  - Document the deduplication behavior
  - Add remarks about performance characteristics
  - _Requirements: 6.1, 6.2, 6.3, 6.4_
- [ ] 10. Final checkpoint - Integration testing
  - Test with real data from multiple collections
  - Verify backward compatibility with existing clients
  - Confirm `TotalCount` accuracy across different scenarios
  - Ensure all tests pass, ask the user if questions arise
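The pipeline shape described in tasks 1, 3, 4 and 5 can be sketched as plain data. The sketch below uses Python dicts rather than the C# driver, so it is not the actual `QueryPagedDeduplicatedResultsAcrossCollectionsAsync` implementation. The function names, the collection name, the descending `ExecutionTime` sort, and the use of `$ifNull` for null `Status` values are all assumptions made for illustration.

```python
def build_dedup_pipeline(match_filter, other_collections, skip, limit):
    """Build an aggregation pipeline that deduplicates before it paginates.

    Hypothetical helper: stage order follows this plan, details are assumed.
    """
    pipeline = [{"$match": match_filter}]

    # $unionWith comes before $group so duplicates that span collections
    # fall into the same group.
    for name in other_collections:
        pipeline.append(
            {"$unionWith": {"coll": name, "pipeline": [{"$match": match_filter}]}}
        )

    pipeline += [
        # Assumed ordering: newest first, so $first keeps the latest record.
        {"$sort": {"ExecutionTime": -1}},
        {"$group": {
            "_id": {
                "Year": "$Year",
                "Month": "$Month",
                "Day": "$Day",
                "ItemId": "$ItemId",
                # One possible way to treat a missing Status like a null one.
                "Status": {"$ifNull": ["$Status", None]},
            },
            "doc": {"$first": "$$ROOT"},
        }},
        {"$replaceRoot": {"newRoot": "$doc"}},
        # $group does not guarantee output order, so sort again before paging.
        {"$sort": {"ExecutionTime": -1}},
        {"$facet": {
            "totalCount": [{"$count": "count"}],
            "data": [{"$skip": skip}, {"$limit": limit}],
        }},
    ]
    return pipeline


def unpack_facet_result(facet_docs):
    """Turn the $facet output into (results, total_count).

    When nothing matches, $count emits no document, so `totalCount` is an
    empty array and the total must default to 0.
    """
    if not facet_docs:
        return [], 0
    facet = facet_docs[0]
    count_docs = facet.get("totalCount", [])
    total = count_docs[0]["count"] if count_docs else 0
    return facet.get("data", []), total


# "inspection_results_2025" is a made-up collection name.
pipeline = build_dedup_pipeline({"Year": 2026}, ["inspection_results_2025"], 0, 20)
print([next(iter(stage)) for stage in pipeline])
print(unpack_facet_result([{"totalCount": [], "data": []}]))
```

Because the count and the page are both computed inside `$facet`, after `$group`, the total reflects deduplicated records and no separate count query is needed.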
## Notes
- Tasks marked with `*` are optional and can be skipped for a faster MVP
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation
- Property tests validate universal correctness properties
- Unit tests validate specific examples and edge cases
- The fix is query-only; no data migration is required
- The existing API contract is maintained for backward compatibility
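As a rough illustration of what the property tests could check, the sketch below runs randomized inputs through an in-memory reference model. The exact property statements live in the requirements and design documents, which are not shown here, so the readings of "Deduplication Completeness", "Pagination Consistency" and "Count Accuracy" used below are assumptions, as are all function names and the generated data.

```python
import random

KEY_FIELDS = ("Year", "Month", "Day", "ItemId", "Status")


def key_of(record):
    """Composite deduplication key; Status may be None."""
    return tuple(record[field] for field in KEY_FIELDS)


def dedupe(records):
    """Reference model: sort newest first, keep the first record per key."""
    ordered = sorted(records, key=lambda r: r["ExecutionTime"], reverse=True)
    seen, kept = set(), []
    for record in ordered:
        if key_of(record) not in seen:
            seen.add(key_of(record))
            kept.append(record)
    return kept


def query_page(records, skip, limit):
    """Model of the fixed query: dedupe first, then page; returns (page, total)."""
    unique = dedupe(records)
    return unique[skip:skip + limit], len(unique)


def random_records(rng, n):
    """Generate n hypothetical records with many key collisions."""
    return [
        {"Year": 2026, "Month": 1, "Day": rng.randint(1, 3),
         "ItemId": rng.choice("ABCD"), "Status": rng.choice(["OK", "Fail", None]),
         "ExecutionTime": rng.randrange(10**6)}
        for _ in range(n)
    ]


def check_properties(seed, n=200, page_size=7):
    rng = random.Random(seed)
    records = random_records(rng, n)
    expected_keys = {key_of(r) for r in records}

    collected, skip = [], 0
    while True:
        page, total = query_page(records, skip, page_size)
        # Assumed reading of count accuracy: total equals the number of distinct keys.
        assert total == len(expected_keys)
        if not page:
            break
        collected.extend(page)
        skip += page_size

    keys = [key_of(r) for r in collected]
    # Assumed reading of deduplication completeness: no key appears twice.
    assert len(keys) == len(set(keys))
    # Assumed reading of pagination consistency: pages cover every key exactly once.
    assert set(keys) == expected_keys
    return len(collected)


for seed in range(20):
    check_properties(seed)
print("all properties held for 20 seeds")
```

A real property test would run the same assertions against the actual query method, with the in-memory model serving only as the expected result.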