# Implementation Plan: Inspection Result Daily Deduplication Fix

## Overview

This implementation plan addresses the bug where daily deduplication of inspection results occurs after pagination instead of before. The fix involves refactoring the MongoDB aggregation pipeline in `SecondaryCircuitInspectionResultAppService.FindDatas` to perform deduplication before pagination.

## Tasks

- [x] 1. Create new aggregation pipeline method with deduplication
  - Create `QueryPagedDeduplicatedResultsAcrossCollectionsAsync` method
  - Implement MongoDB aggregation pipeline with `$group` stage for deduplication
  - Use `$facet` to get both count and paginated results in single query
  - Return tuple of (Results, TotalCount)
  - _Requirements: 1.1, 1.2, 1.3, 3.1, 3.2, 3.3, 4.1, 4.2, 4.3_

  - [ ]* 1.1 Write property test for deduplication completeness
    - **Property 1: Deduplication Completeness**
    - **Validates: Requirements 1.2**

- [x] 2. Refactor `FindDatas` method to use new pipeline
  - Replace `QueryPagedResultsAcrossCollectionsAsync` call with new method
  - Remove in-memory deduplication logic (lines 712-720)
  - Use `TotalCount` from pipeline result instead of separate count query
  - Update logging to reflect new approach
  - _Requirements: 1.1, 1.4, 2.1, 2.2, 2.3_

  - [ ]* 2.1 Write property test for pagination consistency
    - **Property 2: Pagination Consistency**
    - **Validates: Requirements 2.2**

  - [ ]* 2.2 Write property test for count accuracy
    - **Property 4: Count Accuracy**
    - **Validates: Requirements 2.1**

- [x] 3. Implement grouping key construction
  - Build `$group` stage with composite key (Year, Month, Day, ItemId, Status)
  - Use `$first` operator to preserve first record after sorting
  - Handle null `Status` values in grouping
  - _Requirements: 1.2, 1.3_

  - [ ]* 3.1 Write unit test for null status handling
    - Test that null `Status` values are handled correctly in grouping
    - _Requirements: 1.2_

- [x] 4. Implement `$facet` stage for count and pagination
  - Create facet with two pipelines: `totalCount` and `data`
  - Extract count from `totalCount` array
  - Extract paginated results from `data` array
  - Handle empty result sets correctly
  - _Requirements: 2.1, 2.2, 3.4_

  - [ ]* 4.1 Write unit test for empty result sets
    - Test that empty results return `TotalCount` = 0 and empty list
    - _Requirements: 2.1_

- [x] 5. Update cross-collection query logic
  - Ensure `$unionWith` stages are added before `$group`
  - Verify consistent field projection across all collections
  - Test deduplication across collection boundaries
  - _Requirements: 4.1, 4.2, 4.3_

  - [ ]* 5.1 Write property test for cross-collection consistency
    - **Property 5: Cross-Collection Consistency**
    - **Validates: Requirements 4.2**

- [ ] 6. Checkpoint - Ensure all tests pass
  - Run all unit tests and property tests
  - Verify query performance is acceptable
  - Check that API response format is unchanged
  - Ask the user if questions arise

- [x] 7. Add performance monitoring
  - Add logging for deduplication operation timing
  - Log before/after record counts
  - Add warning logs if query time exceeds thresholds
  - _Requirements: 5.2, 5.3, 5.4_

  - [ ]* 7.1 Write unit test for performance logging
    - Verify that performance metrics are logged correctly
    - _Requirements: 5.3_

- [x] 8. Create database index recommendation
  - Document compound index on (Year, Month, Day, ItemId, Status, ExecutionTime)
  - Add index creation script to migration folder
  - Document index purpose and benefits
  - _Requirements: 5.1_

- [ ] 9. Update method documentation
  - Update XML comments for `FindDatas` method
  - Document the deduplication behavior
  - Add remarks about performance characteristics
  - _Requirements: 6.1, 6.2, 6.3, 6.4_

- [ ] 10. Final checkpoint - Integration testing
  - Test with real data from multiple collections
  - Verify backward compatibility with existing clients
  - Confirm `TotalCount` accuracy across different scenarios
  - Ensure all tests pass, ask the user if questions arise

## Notes

- Tasks marked with `*` are optional and can be skipped for faster MVP
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation
- Property tests validate universal correctness properties
- Unit tests validate specific examples and edge cases
- The fix is query-only, no data migration required
- Existing API contract is maintained for backward compatibility
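
## Illustrative Pipeline Sketch

The stage ordering described in tasks 1, 3, 4 and 5 can be sketched as plain aggregation stage documents. This is a minimal illustration in Python, not the service's actual C# implementation. The function names, the collection names and the descending `ExecutionTime` sort are assumptions made for the example. The field names are the ones this plan lists for the grouping key and the index, and the sketch assumes `Year`, `Month` and `Day` exist as stored fields.

```python
from typing import Any


def build_dedup_page_pipeline(
    match_filter: dict[str, Any],
    other_collections: list[str],
    skip: int,
    limit: int,
) -> list[dict[str, Any]]:
    """Build a dedup-then-paginate pipeline (hypothetical helper)."""
    pipeline: list[dict[str, Any]] = [{"$match": match_filter}]

    # Union the other collections BEFORE $group so that deduplication
    # sees candidate records from every collection (task 5).
    for name in other_collections:
        pipeline.append(
            {"$unionWith": {"coll": name, "pipeline": [{"$match": match_filter}]}}
        )

    pipeline += [
        # Assumption: the newest record per group is the one to keep.
        {"$sort": {"ExecutionTime": -1}},
        {
            "$group": {
                "_id": {
                    "Year": "$Year",
                    "Month": "$Month",
                    "Day": "$Day",
                    "ItemId": "$ItemId",
                    # Normalize a missing Status to an explicit null so both
                    # cases fall into the same group (one possible approach).
                    "Status": {"$ifNull": ["$Status", None]},
                },
                # $first keeps the first document per group in sort order.
                "doc": {"$first": "$$ROOT"},
            }
        },
        {"$replaceRoot": {"newRoot": "$doc"}},
        # $group does not guarantee output order, so sort again before paging.
        {"$sort": {"ExecutionTime": -1}},
        {
            "$facet": {
                "totalCount": [{"$count": "count"}],
                "data": [{"$skip": skip}, {"$limit": limit}],
            }
        },
    ]
    return pipeline


def unpack_facet(facet_docs: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], int]:
    """Turn the single $facet output document into (results, total_count)."""
    if not facet_docs:
        return [], 0
    facet = facet_docs[0]
    counts = facet.get("totalCount", [])
    # With no matching documents, $count emits nothing and the array is empty.
    total = counts[0]["count"] if counts else 0
    return facet.get("data", []), total
```

Because `$skip` and `$limit` sit inside the `$facet` stage, after `$group`, both the page contents and the total count are computed from the deduplicated set, which is the ordering this plan aims for. The empty-array branch in `unpack_facet` corresponds to the empty-result case in task 4.1.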