Conversation
dmd79
pushed a commit
that referenced
this pull request
Aug 20, 2017
Otherwise, we can get mismatched largest extent information. One example is: 1. mount f2fs w/ extent_cache 2. make a small extent 3. umount 4. mount f2fs w/o extent_cache 5. update the largest extent 6. umount 7. mount f2fs w/ extent_cache 8. get the old extent made by #2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 20, 2017
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below. Thread #1 Thread #2 - lock_page(ipage) - update i_fields - update i_size/i_blocks/and so on - set FI_DIRTY_INODE - reset FI_DIRTY_INODE - set_page_dirty(ipage) In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 20, 2017
This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 20, 2017
In the check_only mode, we should not make any dirty node pages. Otherwise, we can get this panic: F2FS-fs (nvme0n1p1): Need to recover fsync data ------------[ cut here ]------------ kernel BUG at fs/f2fs/node.c:2204! CPU: 7 PID: 19923 Comm: mount Tainted: G OE 4.9.8 #2 RIP: 0010:[<ffffffffc0979c0b>] [<ffffffffc0979c0b>] flush_nat_entries+0x43b/0x7d0 [f2fs] Call Trace: [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffff860e450f>] ? up_write+0x1f/0x40 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffffc0969f04>] write_checkpoint+0x2f4/0xf20 [f2fs] [<ffffffff860e938d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bd5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc097b6de>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0977b64>] f2fs_write_node_pages+0x34/0x350 [f2fs] [<ffffffff860e5f42>] ? __lock_is_held+0x52/0x70 [<ffffffff861d9b31>] do_writepages+0x21/0x30 [<ffffffff86298ce1>] __writeback_single_inode+0x61/0x760 [<ffffffff86909127>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff8629a735>] writeback_single_inode+0xd5/0x190 [<ffffffff8629a889>] write_inode_now+0x99/0xc0 [<ffffffff86283876>] iput+0x1f6/0x2c0 [<ffffffffc0964b52>] f2fs_fill_super+0xc32/0x10c0 [f2fs] [<ffffffff86266462>] mount_bdev+0x182/0x1b0 [<ffffffffc0963f20>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0960da5>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff86266e08>] mount_fs+0x38/0x170 [<ffffffff86288bab>] vfs_kern_mount+0x6b/0x160 [<ffffffff8628bcfe>] do_mount+0x1be/0xd60 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 21, 2017
Otherwise, we can get mismatched largest extent information. One example is: 1. mount f2fs w/ extent_cache 2. make a small extent 3. umount 4. mount f2fs w/o extent_cache 5. update the largest extent 6. umount 7. mount f2fs w/ extent_cache 8. get the old extent made by #2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 21, 2017
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below. Thread #1 Thread #2 - lock_page(ipage) - update i_fields - update i_size/i_blocks/and so on - set FI_DIRTY_INODE - reset FI_DIRTY_INODE - set_page_dirty(ipage) In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 21, 2017
This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 21, 2017
In the check_only mode, we should not make any dirty node pages. Otherwise, we can get this panic: F2FS-fs (nvme0n1p1): Need to recover fsync data ------------[ cut here ]------------ kernel BUG at fs/f2fs/node.c:2204! CPU: 7 PID: 19923 Comm: mount Tainted: G OE 4.9.8 #2 RIP: 0010:[<ffffffffc0979c0b>] [<ffffffffc0979c0b>] flush_nat_entries+0x43b/0x7d0 [f2fs] Call Trace: [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffff860e450f>] ? up_write+0x1f/0x40 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffffc0969f04>] write_checkpoint+0x2f4/0xf20 [f2fs] [<ffffffff860e938d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bd5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc097b6de>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0977b64>] f2fs_write_node_pages+0x34/0x350 [f2fs] [<ffffffff860e5f42>] ? __lock_is_held+0x52/0x70 [<ffffffff861d9b31>] do_writepages+0x21/0x30 [<ffffffff86298ce1>] __writeback_single_inode+0x61/0x760 [<ffffffff86909127>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff8629a735>] writeback_single_inode+0xd5/0x190 [<ffffffff8629a889>] write_inode_now+0x99/0xc0 [<ffffffff86283876>] iput+0x1f6/0x2c0 [<ffffffffc0964b52>] f2fs_fill_super+0xc32/0x10c0 [f2fs] [<ffffffff86266462>] mount_bdev+0x182/0x1b0 [<ffffffffc0963f20>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0960da5>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff86266e08>] mount_fs+0x38/0x170 [<ffffffff86288bab>] vfs_kern_mount+0x6b/0x160 [<ffffffff8628bcfe>] do_mount+0x1be/0xd60 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 24, 2017
Otherwise, we can get mismatched largest extent information. One example is: 1. mount f2fs w/ extent_cache 2. make a small extent 3. umount 4. mount f2fs w/o extent_cache 5. update the largest extent 6. umount 7. mount f2fs w/ extent_cache 8. get the old extent made by #2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 24, 2017
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below. Thread #1 Thread #2 - lock_page(ipage) - update i_fields - update i_size/i_blocks/and so on - set FI_DIRTY_INODE - reset FI_DIRTY_INODE - set_page_dirty(ipage) In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 24, 2017
This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Aug 24, 2017
In the check_only mode, we should not make any dirty node pages. Otherwise, we can get this panic: F2FS-fs (nvme0n1p1): Need to recover fsync data ------------[ cut here ]------------ kernel BUG at fs/f2fs/node.c:2204! CPU: 7 PID: 19923 Comm: mount Tainted: G OE 4.9.8 #2 RIP: 0010:[<ffffffffc0979c0b>] [<ffffffffc0979c0b>] flush_nat_entries+0x43b/0x7d0 [f2fs] Call Trace: [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffff860e450f>] ? up_write+0x1f/0x40 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffffc0969f04>] write_checkpoint+0x2f4/0xf20 [f2fs] [<ffffffff860e938d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bd5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc097b6de>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0977b64>] f2fs_write_node_pages+0x34/0x350 [f2fs] [<ffffffff860e5f42>] ? __lock_is_held+0x52/0x70 [<ffffffff861d9b31>] do_writepages+0x21/0x30 [<ffffffff86298ce1>] __writeback_single_inode+0x61/0x760 [<ffffffff86909127>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff8629a735>] writeback_single_inode+0xd5/0x190 [<ffffffff8629a889>] write_inode_now+0x99/0xc0 [<ffffffff86283876>] iput+0x1f6/0x2c0 [<ffffffffc0964b52>] f2fs_fill_super+0xc32/0x10c0 [f2fs] [<ffffffff86266462>] mount_bdev+0x182/0x1b0 [<ffffffffc0963f20>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0960da5>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff86266e08>] mount_fs+0x38/0x170 [<ffffffff86288bab>] vfs_kern_mount+0x6b/0x160 [<ffffffff8628bcfe>] do_mount+0x1be/0xd60 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Sep 21, 2017
Otherwise, we can get mismatched largest extent information. One example is: 1. mount f2fs w/ extent_cache 2. make a small extent 3. umount 4. mount f2fs w/o extent_cache 5. update the largest extent 6. umount 7. mount f2fs w/ extent_cache 8. get the old extent made by #2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Sep 21, 2017
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below. Thread #1 Thread #2 - lock_page(ipage) - update i_fields - update i_size/i_blocks/and so on - set FI_DIRTY_INODE - reset FI_DIRTY_INODE - set_page_dirty(ipage) In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Sep 21, 2017
This patch enhances the xattr consistency of dirs from suddern power-cuts. Possible scenario would be: 1. dir->setxattr used by per-file encryption 2. file->setxattr goes into inline_xattr 3. file->fsync In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Sep 21, 2017
In the check_only mode, we should not make any dirty node pages. Otherwise, we can get this panic: F2FS-fs (nvme0n1p1): Need to recover fsync data ------------[ cut here ]------------ kernel BUG at fs/f2fs/node.c:2204! CPU: 7 PID: 19923 Comm: mount Tainted: G OE 4.9.8 #2 RIP: 0010:[<ffffffffc0979c0b>] [<ffffffffc0979c0b>] flush_nat_entries+0x43b/0x7d0 [f2fs] Call Trace: [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs] [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffff860e450f>] ? up_write+0x1f/0x40 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs] [<ffffffffc0969f04>] write_checkpoint+0x2f4/0xf20 [f2fs] [<ffffffff860e938d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0960bd5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc097b6de>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0977b64>] f2fs_write_node_pages+0x34/0x350 [f2fs] [<ffffffff860e5f42>] ? __lock_is_held+0x52/0x70 [<ffffffff861d9b31>] do_writepages+0x21/0x30 [<ffffffff86298ce1>] __writeback_single_inode+0x61/0x760 [<ffffffff86909127>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff8629a735>] writeback_single_inode+0xd5/0x190 [<ffffffff8629a889>] write_inode_now+0x99/0xc0 [<ffffffff86283876>] iput+0x1f6/0x2c0 [<ffffffffc0964b52>] f2fs_fill_super+0xc32/0x10c0 [f2fs] [<ffffffff86266462>] mount_bdev+0x182/0x1b0 [<ffffffffc0963f20>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0960da5>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff86266e08>] mount_fs+0x38/0x170 [<ffffffff86288bab>] vfs_kern_mount+0x6b/0x160 [<ffffffff8628bcfe>] do_mount+0x1be/0xd60 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
dmd79
pushed a commit
that referenced
this pull request
Sep 21, 2017
(url: https://patchwork.kernel.org/patch/9857935/) When ->freeze_fs is called from lvm for doing snapshot, it needs to make sure there will be no more changes in filesystem's data, however, previously, background threads like GC thread wasn't aware of freezing, so in environment with active background threads, data of snapshot becomes unstable. This patch fixes this issue by adding sb_{start,end}_intwrite in below background threads: - GC thread - flush thread - discard thread Note that, don't use sb_start_intwrite() in gc_thread_func() due to: generic/241 reports below bug: ====================================================== WARNING: possible circular locking dependency detected 4.13.0-rc1+ #32 Tainted: G O ------------------------------------------------------ f2fs_gc-250:0/22186 is trying to acquire lock: (&sbi->gc_mutex){+.+...}, at: [<f8fa7f0b>] f2fs_sync_fs+0x7b/0x1b0 [f2fs] but task is already holding lock: (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (sb_internal#2){++++.-}: __lock_acquire+0x405/0x7b0 lock_acquire+0xae/0x220 __sb_start_write+0x11d/0x1f0 f2fs_evict_inode+0x2d6/0x4e0 [f2fs] evict+0xa8/0x170 iput+0x1fb/0x2c0 f2fs_sync_inode_meta+0x3f/0xf0 [f2fs] write_checkpoint+0x1b1/0x750 [f2fs] f2fs_sync_fs+0x85/0x1b0 [f2fs] f2fs_do_sync_file.isra.24+0x137/0xa30 [f2fs] f2fs_sync_file+0x34/0x40 [f2fs] vfs_fsync_range+0x4a/0xa0 do_fsync+0x3c/0x60 SyS_fdatasync+0x15/0x20 do_fast_syscall_32+0xa1/0x1b0 entry_SYSENTER_32+0x4c/0x7b -> #1 (&sbi->cp_mutex){+.+...}: __lock_acquire+0x405/0x7b0 lock_acquire+0xae/0x220 __mutex_lock+0x4f/0x830 mutex_lock_nested+0x25/0x30 write_checkpoint+0x2f/0x750 [f2fs] f2fs_sync_fs+0x85/0x1b0 [f2fs] sync_filesystem+0x67/0x80 generic_shutdown_super+0x27/0x100 kill_block_super+0x22/0x50 kill_f2fs_super+0x3a/0x40 [f2fs] deactivate_locked_super+0x3d/0x70 deactivate_super+0x40/0x60 cleanup_mnt+0x39/0x70 __cleanup_mnt+0x10/0x20 task_work_run+0x69/0x80 exit_to_usermode_loop+0x57/0x92 do_fast_syscall_32+0x18c/0x1b0 entry_SYSENTER_32+0x4c/0x7b -> #0 (&sbi->gc_mutex){+.+...}: validate_chain.isra.36+0xc50/0xdb0 __lock_acquire+0x405/0x7b0 lock_acquire+0xae/0x220 __mutex_lock+0x4f/0x830 mutex_lock_nested+0x25/0x30 f2fs_sync_fs+0x7b/0x1b0 [f2fs] f2fs_balance_fs_bg+0xb9/0x200 [f2fs] gc_thread_func+0x302/0x4a0 [f2fs] kthread+0xe9/0x120 ret_from_fork+0x19/0x24 other info that might help us debug this: Chain exists of: &sbi->gc_mutex --> &sbi->cp_mutex --> sb_internal#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(sb_internal#2); lock(&sbi->cp_mutex); lock(sb_internal#2); lock(&sbi->gc_mutex); *** DEADLOCK *** 1 lock held by f2fs_gc-250:0/22186: #0: (sb_internal#2){++++.-}, at: [<f8fb5609>] gc_thread_func+0x159/0x4a0 [f2fs] stack backtrace: CPU: 2 PID: 22186 Comm: f2fs_gc-250:0 Tainted: G O 4.13.0-rc1+ #32 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack+0x5f/0x92 print_circular_bug+0x1b3/0x1bd validate_chain.isra.36+0xc50/0xdb0 ? __this_cpu_preempt_check+0xf/0x20 __lock_acquire+0x405/0x7b0 lock_acquire+0xae/0x220 ? f2fs_sync_fs+0x7b/0x1b0 [f2fs] __mutex_lock+0x4f/0x830 ? f2fs_sync_fs+0x7b/0x1b0 [f2fs] mutex_lock_nested+0x25/0x30 ? f2fs_sync_fs+0x7b/0x1b0 [f2fs] f2fs_sync_fs+0x7b/0x1b0 [f2fs] f2fs_balance_fs_bg+0xb9/0x200 [f2fs] gc_thread_func+0x302/0x4a0 [f2fs] ? preempt_schedule_common+0x2f/0x4d ? f2fs_gc+0x540/0x540 [f2fs] kthread+0xe9/0x120 ? f2fs_gc+0x540/0x540 [f2fs] ? kthread_create_on_node+0x30/0x30 ret_from_fork+0x19/0x24 The deadlock occurs in below condition: GC Thread Thread B - sb_start_intwrite - f2fs_sync_file - f2fs_sync_fs - mutex_lock(&sbi->gc_mutex) - write_checkpoint - block_operations - f2fs_sync_inode_meta - iput - sb_start_intwrite - mutex_lock(&sbi->gc_mutex) Fix this by altering sb_start_intwrite to sb_start_write_trylock. Change-Id: Ibea8cff73d684e5aebc950f29ef4d611fc10df76 Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Freshly checking out n7x-caf doesn't compile for me. Lots of bad include paths etc, I don't know. Please see comparison, all harmless and I suggest a merge in order to make it easier for other devs.