I upgraded to sys-kernel/hardened-sources-4.7.9, but that changed nothing. I stayed on that kernel for debugging anyway, so everything below was done on sys-kernel/hardened-sources-4.7.9 from Gentoo.
I think I'm close to the cause, but I can't tell whether the Linux reassembly code is "broken" or whether the PaX overflow detection triggers in a case where it should not.
Now I need some help with how to report this and get it fixed properly...
As far as I understand, any fragmented IPv6 (TCP?) packet will trigger the PaX overflow detection and panic the kernel.
I already opened a Gentoo bug for the issue. It has some more data, including two full panic logs and a packet capture of packets causing the panic: https://bugs.gentoo.org/show_bug.cgi?id=597792
According to PaX, the problematic code is the second line of the following calculation from net/ipv6/reassembly.c:
end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1))); // line 223 in net/ipv6/reassembly.c
With the code above left unchanged and only the first printk line from the patch below added, I do not see any real overflow, even if the casts were cutting off the higher bits (and I think they are not).
Here is the debug output for two PaX overflow reports (the Gentoo ticket has the full trace log, if needed):
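To make the pointer part of that expression concrete, here is a small user-space sketch of what I believe the subtraction computes when the fragment header directly follows the fixed IPv6 header. The struct names `mock_ipv6hdr` and `mock_frag_hdr` and both helper functions are made up for this illustration and are not the kernel definitions; only the sizes (40 bytes and 8 bytes) are meant to match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structs, sizes only. */
struct mock_ipv6hdr {
    uint8_t raw[40];            /* the fixed IPv6 header is 40 bytes */
};

struct mock_frag_hdr {          /* the fragment header is 8 bytes */
    uint8_t  nexthdr;
    uint8_t  reserved;
    uint16_t frag_off;
    uint32_t identification;
};

/* Same shape as the kernel expression: bytes between the end of the
 * IPv6 header and the end of the fragment header. */
static ptrdiff_t ext_hdr_bytes(const struct mock_ipv6hdr *hdr,
                               const struct mock_frag_hdr *fhdr)
{
    return (const uint8_t *)(fhdr + 1) - (const uint8_t *)(hdr + 1);
}

/* Assumption: the fragment header sits right behind the IPv6 header,
 * with no other extension headers in between. */
static ptrdiff_t demo_direct_fragment_header(void)
{
    static union { uint8_t raw[48]; uint32_t align; } pkt;
    const struct mock_ipv6hdr *hdr = (const struct mock_ipv6hdr *)pkt.raw;
    const struct mock_frag_hdr *fhdr = (const struct mock_frag_hdr *)(hdr + 1);

    assert(sizeof(struct mock_frag_hdr) == 8);
    return ext_hdr_bytes(hdr, fhdr);
}
```

Under that assumption the difference is a small positive number (8), so the subtraction from payload_len should not get anywhere near the limits of an int.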
Oct 22 22:32:01 gandalf kernel: DDD ip6_frag_queue offset=0; payload_len=4d8; fhdr1=f1c8a27e; ipv6_hdr=f1c8a276
Oct 22 22:32:01 gandalf kernel: DDD ip6_frag_queue offset=4d0; payload_len=199; fhdr1=f1c8a07e; ipv6_hdr=f1c8a076
Based on my (limited) understanding I tried to work around this PaX overflow crash, and the workaround did indeed work:
I still get debug output from the added printk lines, but no more PaX overflow reports.
Here is the patch which kind of works for me:
--- /tmp/reassembly.c 2016-10-23 14:13:28.086253478 +0200
+++ net/ipv6/reassembly.c 2016-10-22 23:13:52.778959198 +0200
@@ -211,7 +211,7 @@
{
struct sk_buff *prev, *next;
struct net_device *dev;
- int offset, end;
+ int offset, end, temp;
struct net *net = dev_net(skb_dst(skb)->dev);
u8 ecn;
@@ -219,8 +219,14 @@
goto err;
offset = ntohs(fhdr->frag_off) & ~0x7;
- end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
- ((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
+
+ printk ("DDD ip6_frag_queue offset=%x; payload_len=%x; fhdr1=%x; ipv6_hdr=%x\n", offset, ntohs(ipv6_hdr(skb)->payload_len), (u8 *)(fhdr + 1), (u8 *)(ipv6_hdr(skb) + 1));
+
+ temp = (u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1);
+ temp &= 0xff; //This line is probably not needed and wrong!!
+ end = offset + (ntohs(ipv6_hdr(skb)->payload_len) - temp);
+// ((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
+ printk ("DDD ip6_frag_queue2 end=%x\n", end);
if ((unsigned int)end > IPV6_MAXPLEN) {
__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
I've now removed the probably incorrect line "temp &= 0xff;" from my running kernel, and that also seems to work. The added printk lines still get triggered, but there are no more complaints from PaX.
I suspect the PaX overflow detection somehow gets confused by the u8* casts, but I'm only guessing here.
Here are some log lines produced with the full patch above, but without the probably invalid line "temp &= 0xff;":
[ 296.213127] DDD ip6_frag_queue offset=0; payload_len=4d8; fhdr1=c557987e; ipv6_hdr=c5579876
[ 296.213157] DDD ip6_frag_queue2 end=4d0
[ 296.213175] DDD ip6_frag_queue offset=4d0; payload_len=8e; fhdr1=c557967e; ipv6_hdr=c5579676
[ 296.213189] DDD ip6_frag_queue2 end=556
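As a sanity check, the same arithmetic can be replayed in user space with the values from the log lines above. This is only a sketch: `frag_end` and `replay_logged_fragments` are names I made up, the pointer values are treated as plain 32-bit numbers as they were printed, and the computation mirrors the patched version with the temp variable (without the "temp &= 0xff;" line).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors: temp = fhdr1 - ipv6_hdr1; end = offset + (payload_len - temp);
 * The two addresses are the values printed by the first debug printk. */
static int frag_end(int offset, uint16_t payload_len,
                    uint32_t fhdr1, uint32_t ipv6_hdr1)
{
    int hdr_bytes = (int)(fhdr1 - ipv6_hdr1);  /* 8 in both logged cases */
    return offset + ((int)payload_len - hdr_bytes);
}

/* Replays both logged fragments; returns 1 if the results match the
 * "end=" values printed by the second debug printk. */
static int replay_logged_fragments(void)
{
    int end1 = frag_end(0x0,   0x4d8, 0xc557987eu, 0xc5579876u);
    int end2 = frag_end(0x4d0, 0x8e,  0xc557967eu, 0xc5579676u);

    return end1 == 0x4d0 && end2 == 0x556;
}
```

Both results agree with the logged end=4d0 and end=556, and every intermediate value is a small positive number, which is why I do not see where a real overflow would come from.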
I even removed the pax_size_overflow_report_only boot parameter, and now only the debug code is triggered and PaX is fine.
Any pointers on how to get this fixed properly?