Page 1 of 1

PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Tue Aug 09, 2016 11:40 am
by rot
Hi,

I have problem with drbd on gentoo-hardened, kernel versions 4.4.8-hardened-r1 and 4.6.5-hardened-r1.


[ 48.687425] drbd drbd_221: Starting worker thread (from drbdsetup-84 [2086])
[ 48.701378] block drbd221: disk( Diskless -> Attaching )
[ 48.706554] drbd drbd_221: Method to ensure write ordering: flush
[ 48.706558] block drbd221: max BIO size = 131072
[ 48.706569] block drbd221: drbd_bm_resize called with capacity == 949468800
[ 48.719252] block drbd221: resync bitmap: bits=118683600 words=1854432 pages=3622
[ 48.719260] block drbd221: size = 453 GB (474734400 KB)
[ 48.827657] block drbd221: recounting of set bits took additional 12 jiffies
[ 48.827664] block drbd221: 1048 MB (268288 bits) marked out-of-sync by on disk bit-map.
[ 48.827676] block drbd221: disk( Attaching -> UpToDate )
[ 48.827681] block drbd221: attached to UUIDs 1CFE6629346C774C:0000000000000000:6521C977FB0CAB0F:6520C977FB0CAB0F
[ 48.892490] drbd drbd_222: Starting worker thread (from drbdsetup-84 [2092])
[ 48.893302] block drbd222: disk( Diskless -> Attaching )
[ 48.893495] drbd drbd_222: Method to ensure write ordering: flush
[ 48.893499] block drbd222: max BIO size = 131072
[ 48.893507] block drbd222: drbd_bm_resize called with capacity == 949468800
[ 48.911321] block drbd222: resync bitmap: bits=118683600 words=1854432 pages=3622
[ 48.911330] block drbd222: size = 453 GB (474734400 KB)
[ 49.001425] block drbd222: recounting of set bits took additional 11 jiffies
[ 49.001432] block drbd222: 353 GB (92510160 bits) marked out-of-sync by on disk bit-map.
[ 49.001453] block drbd222: disk( Attaching -> Inconsistent )
[ 49.001458] block drbd222: attached to UUIDs 0002000000000004:0000000000000000:886DC0F7ABE4AB02:0000000000000000
[ 49.010862] drbd drbd_221: conn( StandAlone -> Unconnected )
[ 49.010981] drbd drbd_221: Starting receiver thread (from drbd_w_drbd_221 [2087])
[ 49.011459] drbd drbd_221: receiver (re)started
[ 49.011499] drbd drbd_221: conn( Unconnected -> WFConnection )
[ 49.021504] drbd drbd_222: conn( StandAlone -> Unconnected )
[ 49.021546] drbd drbd_222: Starting receiver thread (from drbd_w_drbd_222 [2094])
[ 49.022262] drbd drbd_222: receiver (re)started
[ 49.022297] drbd drbd_222: conn( Unconnected -> WFConnection )
[ 49.571570] drbd drbd_221: initial packet S crossed
[ 49.571712] drbd drbd_222: initial packet S crossed
[ 59.143277] drbd drbd_221: Handshake successful: Agreed network protocol version 101
[ 59.143284] drbd drbd_221: Agreed to support TRIM on protocol level
[ 59.143339] drbd drbd_221: conn( WFConnection -> WFReportParams )
[ 59.143343] drbd drbd_221: Starting asender thread (from drbd_r_drbd_221 [2098])
[ 59.157149] block drbd221: drbd_sync_handshake:
[ 59.157157] block drbd221: self 1CFE6629346C774C:0000000000000000:6521C977FB0CAB0F:6520C977FB0CAB0F bits:268288 flags:0
[ 59.157161] block drbd221: peer 1CFE6629346C774C:0000000000000000:6521C977FB0CAB0E:6520C977FB0CAB0F bits:0 flags:0
[ 59.157164] block drbd221: uuid_compare()=1 by rule 40
[ 59.157175] block drbd221: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
[ 59.157306] PAX: size overflow detected in function drbd_send_bitmap drivers/block/drbd/drbd_main.c:1072 cicus.1244_488 min, count: 102, decl: encoding; num: 0; context: p_compressed_bm;
[ 59.158357] CPU: 0 PID: 2087 Comm: drbd_w_drbd_221 Not tainted 4.4.8-hardened-r1-fx #2
[ 59.158360] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 59.158363] 0000000000000000 ffffc90000833cc0 ffffffff813ef95f 0000000000000001
[ 59.158367] ffffffffa0052918 0000000000000430 ffffc90000833cf0 ffffffff8116134e
[ 59.158370] ffff88007f976011 ffff88007f976010 0000000000000000 ffff8800bba3a000
[ 59.158374] Call Trace:
[ 59.158420] [<ffffffff813ef95f>] dump_stack+0x4e/0x7f
[ 59.158435] [<ffffffffa0052918>] ? put_ldev+0x8001/0x15628 [drbd]
[ 59.158458] [<ffffffff8116134e>] report_size_overflow+0x6e/0x80
[ 59.158466] [<ffffffffa00374da>] drbd_send_bitmap+0x68a/0xb80 [drbd]
[ 59.158470] [<ffffffff8114406d>] ? kfree+0x1d/0x100
[ 59.158477] [<ffffffffa003005b>] w_bitmap_io+0x11b/0x350 [drbd]
[ 59.158483] [<ffffffffa0019649>] drbd_worker+0xe9/0x3a0 [drbd]
[ 59.158490] [<ffffffffa002f930>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
[ 59.158496] [<ffffffffa002f988>] drbd_thread_setup+0x58/0x170 [drbd]
[ 59.158502] [<ffffffffa002f930>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
[ 59.158519] [<ffffffff81072415>] kthread+0xd5/0xf0
[ 59.158523] [<ffffffff81072340>] ? kthread_create_on_node+0x170/0x170
[ 59.158555] [<ffffffff817b3bfe>] ret_from_fork+0x3e/0x70
[ 59.158559] [<ffffffff81072340>] ? kthread_create_on_node+0x170/0x170
[ 61.500228] drbd drbd_222: initial packet M crossed
[ 62.501253] drbd drbd_222: Handshake successful: Agreed network protocol version 101
[ 62.501258] drbd drbd_222: Agreed to support TRIM on protocol level
[ 62.501316] drbd drbd_222: conn( WFConnection -> WFReportParams )
[ 62.501328] drbd drbd_222: Starting asender thread (from drbd_r_drbd_222 [2100])
[ 62.515144] block drbd222: drbd_sync_handshake:
[ 62.515150] block drbd222: self 0002000000000004:0000000000000000:886DC0F7ABE4AB02:0000000000000000 bits:92510160 flags:0
[ 62.515154] block drbd222: peer 96FF4A6DA7C16B9F:0002000000000005:0001000000000005:0000000000000005 bits:92731314 flags:0
[ 62.515158] block drbd222: uuid_compare()=-1 by rule 50
[ 62.515161] block drbd222: Becoming sync target due to disk states.
[ 62.515168] block drbd222: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
[ 62.785144] block drbd222: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 65(1), total 65; compression: 100.0%
[ 62.790408] block drbd222: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 65(1), total 65; compression: 100.0%
[ 62.790422] block drbd222: conn( WFBitMapT -> WFSyncUUID )
[ 62.826388] block drbd222: updated sync uuid 0003000000000004:0000000000000000:886DC0F7ABE4AB02:0000000000000000
[ 62.827356] block drbd222: helper command: /sbin/drbdadm before-resync-target minor-222
[ 62.839333] block drbd222: helper command: /sbin/drbdadm before-resync-target minor-222 exit code 0 (0x0)
[ 62.839354] block drbd222: conn( WFSyncUUID -> SyncTarget )
[ 62.839375] block drbd222: Began resync as SyncTarget (will sync 370965192 KB [92741298 bits set]).

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Thu Sep 01, 2016 2:19 pm
by PaX Team
the drbd size overflow reports should be fixed in newer kernels, can you try them?

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Fri Sep 02, 2016 7:46 am
by br1
Hi,
I have the same problem with gentoo-hardened, versions 4.7.2-hardened and 4.7.2-hardened-r1.
The error appear on initial volum synchronization, just after the "drbdadm -- --overwrite-data-of-peer primary drbd_MyVolume" is issued on primary node.

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Fri Sep 02, 2016 8:57 am
by rot
Newest i tried is hardened-sources-4.7.0-r1
Which kernel version should I try.

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Fri Sep 02, 2016 10:29 am
by PaX Team
4.7.2 is newest, can you try that and post dmesg if something still triggers the size overflow check?

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Fri Sep 02, 2016 1:08 pm
by br1
dmesg output with linux-4.7.2-hardened-r1:

[ 2323.651892] DLM installed
[ 2412.086620] OCFS2 User DLM kernel interface loaded
[ 2647.890707] drbd: failed to initialize debugfs -- will not be available
[ 2647.890716] drbd: initialized. Version: 8.4.6 (api:1/proto:86-101)
[ 2647.890719] drbd: srcversion: 9374C77DCDE8659EEBA696C
[ 2647.890721] drbd: registered as block device major 147
[ 2652.289255] drbd drbd_32c: Starting worker thread (from drbdsetup [8245])
[ 2652.295152] block drbd1: disk( Diskless -> Attaching )
[ 2652.295357] drbd drbd_32c: Method to ensure write ordering: flush
[ 2652.295386] block drbd1: max BIO size = 4096
[ 2652.295400] block drbd1: drbd_bm_resize called with capacity == 4194104
[ 2652.295446] block drbd1: resync bitmap: bits=524263 words=8192 pages=16
[ 2652.295453] block drbd1: size = 2048 MB (2097052 KB)
[ 2652.310156] block drbd1: Writing the whole bitmap, size changed
[ 2652.313053] block drbd1: bitmap WRITE of 16 pages took 3 jiffies
[ 2652.313063] block drbd1: 2048 MB (524263 bits) marked out-of-sync by on disk bit-map.
[ 2652.317234] block drbd1: recounting of set bits took additional 0 jiffies
[ 2652.317241] block drbd1: 2048 MB (524263 bits) marked out-of-sync by on disk bit-map.
[ 2652.317273] block drbd1: Suspended AL updates
[ 2652.317281] block drbd1: disk( Attaching -> Inconsistent )
[ 2652.317324] block drbd1: attached to UUIDs 0000000000000004:0000000000000000:0000000000000000:0000000000000000
[ 2652.323637] drbd drbd_32c: conn( StandAlone -> Unconnected )
[ 2652.323734] drbd drbd_32c: Starting receiver thread (from drbd_w_drbd_32c [8247])
[ 2652.323856] drbd drbd_32c: receiver (re)started
[ 2652.323987] drbd drbd_32c: conn( Unconnected -> WFConnection )
[ 2655.327576] drbd drbd_32c: initial packet S crossed
[ 2656.327267] drbd drbd_32c: Handshake successful: Agreed network protocol version 101
[ 2656.327273] drbd drbd_32c: Agreed to support TRIM on protocol level
[ 2656.327444] drbd drbd_32c: conn( WFConnection -> WFReportParams )
[ 2656.327468] drbd drbd_32c: Starting ack_recv thread (from drbd_r_drbd_32c [8250])
[ 2656.338180] block drbd1: max BIO size = 1048576
[ 2656.338204] block drbd1: drbd_sync_handshake:
[ 2656.338211] block drbd1: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:524263 flags:0
[ 2656.338216] block drbd1: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:524263 flags:0
[ 2656.338221] block drbd1: uuid_compare()=0 by rule 10
[ 2656.338226] block drbd1: No resync, but 524263 bits in bitmap!
[ 2656.338238] block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )
[ 2656.338308] block drbd1: Resumed AL updates
[ 2761.090710] block drbd1: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate )
[ 2761.091679] block drbd1: Forced to consider local data as UpToDate!
[ 2761.091696] block drbd1: new current UUID 6557D2C9C8D33AC3:0000000000000004:0000000000000000:0000000000000000
[ 2761.112327] block drbd1: drbd_sync_handshake:
[ 2761.112336] block drbd1: self 6557D2C9C8D33AC3:0000000000000004:0000000000000000:0000000000000000 bits:524263 flags:0
[ 2761.112341] block drbd1: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:524263 flags:0
[ 2761.112344] block drbd1: uuid_compare()=2 by rule 30
[ 2761.112347] block drbd1: Becoming sync source due to disk states.
[ 2761.112351] block drbd1: Writing the whole bitmap, full sync required after drbd_sync_handshake.
[ 2761.115268] block drbd1: bitmap WRITE of 16 pages took 2 jiffies
[ 2761.115276] block drbd1: 2048 MB (524263 bits) marked out-of-sync by on disk bit-map.
[ 2761.116882] block drbd1: conn( Connected -> WFBitMapS )
[ 2761.116987] PAX: size overflow detected in function drbd_send_bitmap drivers/block/drbd/drbd_main.c:1073 cicus.1222_452 min, count: 152, decl: encoding; num: 0; context: p_compressed_bm;
[ 2761.117216] CPU: 0 PID: 8247 Comm: drbd_w_drbd_32c Not tainted 4.7.2-hardened-r1 #1
[ 2761.117219] Hardware name: /D2700DC , BIOS MUCDT10N.86A.0076.2013.0618.1644 06/18/2013
[ 2761.117223] 0000000000000000 0000000000000286 0000000000000000 ffffffff812538bb
[ 2761.117231] ffff88012fc0f108 ffffffffa06e18f0 0000000000000431 ffffffff811632d5
[ 2761.117237] 0000000000000000 ffffc900003bbe10 ffff8800c913d011 00000000ffffff80
[ 2761.117243] Call Trace:
[ 2761.117257] [<ffffffff812538bb>] ? dump_stack+0x47/0x7c
[ 2761.117280] [<ffffffffa06e18f0>] ? drbd_nla_find_nested+0x79b0/0x183cd [drbd]
[ 2761.117292] [<ffffffff811632d5>] ? report_size_overflow+0x65/0x90
[ 2761.117309] [<ffffffffa06bfdf1>] ? drbd_send_bitmap+0x5e1/0xc10 [drbd]
[ 2761.117316] [<ffffffff81154b6d>] ? kfree+0xcd/0x1a0
[ 2761.117333] [<ffffffffa06b83db>] ? w_bitmap_io+0x7b/0x3b0 [drbd]
[ 2761.117350] [<ffffffffa06a0655>] ? drbd_worker+0x105/0x410 [drbd]
[ 2761.117368] [<ffffffffa06b7ab0>] ? drbd_rs_del_all+0x4d0/0x4d0 [drbd]
[ 2761.117390] [<ffffffffa06b7b05>] ? drbd_thread_setup+0x55/0x190 [drbd]
[ 2761.117409] [<ffffffffa06b7ab0>] ? drbd_rs_del_all+0x4d0/0x4d0 [drbd]
[ 2761.117419] [<ffffffff8108128b>] ? kthread+0xcb/0x100
[ 2761.117427] [<ffffffff8164d6be>] ? ret_from_fork+0x1e/0x50
[ 2761.117434] [<ffffffff810811c0>] ? kthread_create_on_node+0x190/0x190
[ 5183.575521] block drbd1: role( Primary -> Secondary )
[ 5190.283648] drbd drbd_32c: peer( Secondary -> Unknown ) conn( WFBitMapS -> TearDown )
[ 5190.283714] drbd drbd_32c: ack_receiver terminated
[ 5190.283719] drbd drbd_32c: Terminating drbd_a_drbd_32c

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Tue Sep 06, 2016 6:02 am
by rot
I have the same error in linux-4.7.2-hardened-r1
The newest kernel version I tried in gentoo hardened without this problem is linux-3.18.9-hardened
linux-4.0.8-hardened already has this problem

br1, you can do:
drbdadm create-md all
drbdadm up all
drbdadm invalidate all
This method doesn't causes Pax overflow for me, but restarting one node, can make error on second node.

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Wed Sep 07, 2016 4:37 am
by br1
thank you rot, it worked.

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Thu Sep 08, 2016 4:15 am
by PaX Team
thanks for the reports guys, what happens is that gcc makes a check inserted by the plugin unconditional (i.e., it eliminates the corresponding range check as if it could prove that the overflow always triggers), we're investigating it.

Re: PAX: size overflow detected in function drbd_send_bitmap

PostPosted: Thu Nov 24, 2016 10:12 am
by Arach
Is there any progress on this one? This bug also exists in stable 4.4-grsec.