Friday, May 15, 2009

Fun with assembly level debugging

Investigations into a failed regression test revealed that some allocations were breaking page boundaries and when these allocations were passed as buffers to bio's - the read operation failed. The reason for the failure was that DASD requires the bv_len to be on blocksize boundaries whereas if the earlier allocation was breaking across page boundaries without blocksize alignment. It was suspected that the failure was in the dasd_eckd module in dasd_eckd_build_cp():

rq_for_each_bio(bio, req) {
bio_for_each_segment(bv, bio, i) {
if (bv->bv_len & (blksize - 1))
/* Eckd can only do full blocks. */
return ERR_PTR(-EINVAL);

We needed to make sure that this was the actual point of failure but since we cannot modify the kernel on IBM zSeries machines, we needed to use console debugging to confirm the suspicion.

Firstly install kernel-debuginfo package which you will need for debugging.

* Get disassembly code of the module using objdump

objdump -d /lib/modules/2.6.9-67.EL/kernel/drivers/s390/block/dasd_eckd_mod.ko > dasd_eckd_dis

0000000000001ef0 :
1ef0: eb 6f f0 48 00 24 stmg %r6,%r15,72(%r15)
1ef6: c0 d0 00 00 00 00 larl %r13,1ef6
1efc: b9 04 00 ef lgr %r14,%r15
1f00: a7 fb fe 40 aghi %r15,-448
1f04: c0 c0 00 00 00 00 larl %r12,1f04
1f0a: e3 e0 f0 98 00 24 stg %r14,152(%r15)
1f10: e3 30 f0 e0 00 24 stg %r3,224(%r15)
.....
.....
2134: a7 0a ff ff ahi %r0,-1
2138: a7 aa 00 09 ahi %r10,9
213c: 18 51 lr %r5,%r1
213e: e3 e0 c0 00 00 04 lg %r14,0(%r12)
2144: e3 90 f1 18 00 04 lg %r9,280(%r15)
214a: 58 20 90 08 l %r2,8(%r9)
214e: 18 12 lr %r1,%r2
2150: 18 32 lr %r3,%r2
2152: 14 10 nr %r1,%r0
2154: 88 30 a0 00 srl %r3,0(%r10)
* 2158: a7 74 04 53 jne 29fe
215c: e3 10 90 00 00 04 lg %r1,0(%r9)
.....
.....
29fe: a7 29 ff ea lghi %r2,-22
2a02: a7 f4 06 1b j 3638

The line with the '*' was the suspected line which was doing the bv_len alignment check. The jump from this line to dasd_eckd_build_cp+0x29fe confirmed this. EINVAL = 22 and after loading the return value the function returns.

Now while setting breakpoints, remember not to use addresses or offsets from the objdump(if you do remember to subtract start addr of the function from the offset of the instruction). Or better yet you can use the disassembly output from GDB:

$ gdb /lib/modules/2.6.9-67.EL/kernel/drivers/s390/block/dasd_eckd_mod.ko

(gdb) disassemble dasd_eckd_build_cp
Dump of assembler code for function dasd_eckd_build_cp:
0x0000000000001ef0 : stmg %r6,%r15,72(%r15)
0x0000000000001ef6 : larl %r13,0x1ef6
0x0000000000001efc : lgr %r14,%r15
0x0000000000001f00 : aghi %r15,-448
0x0000000000001f04 : larl %r12,0x1f04
0x0000000000001f0a : stg %r14,152(%r15)
0x0000000000001f10 : stg %r3,224(%r15)
0x0000000000001f16 : stg %r2,216(%r15)
0x0000000000001f1c : mvi 359(%r15),134
0x0000000000001f20 : lg %r1,16(%r3)
0x0000000000001f26 : tml %r1,1
0x0000000000001f2a : mvc 232(8,%r15),88(%r2)
0x0000000000001f30 : je 0x1f40
0x0000000000001f34 : tml %r1,1
0x0000000000001f38 : mvi 359(%r15),133
0x0000000000001f3c : je 0x29fe
0x0000000000001f40 : lg %r2,232(%r15)
.....
.....

Remember that the offsets here are in decimal not in hex.

0x00000000000029fe : lghi %r2,-22

OK, so as you can see we are jumping to 0x29fe which is the above instruction for which the offset is 2830. So find out the address for dasd_eckd_build_cp from kallsyms

cat /proc/kallsyms > kallsyms.

Here it is 0x000000006088f27c, so we place a breakpoint on /x 0x000000006088f27c+ 2830

Go to the CP console and all the rest is to be done here:

#CP TR I PSWA 6088fd8a

- this will set a trace point at given address.

The reason we should not set a tracepoint on the above jump instructions is because we see many false positives and that makes debugging impossible.

You can query all tracepoints using #CP Q TR.

When you hit a tracepoint, you can do the following to continue #CP B

If you want to single step: #CP TR I
followed by #CP B
Each "#CP B" will take you forward one instruction.

When you want to end tracing and remove all traces do #CP TR END

If you want to break within a function address range:
#CP TR I R .

eg. #CP TR I R 6088f27c.200

To display registers you can use:
#CP D G - group registers
#CP D X - control registers
#CP D AR - access registers
#CP D PSW - display PSW

After insmod'ing a reproducer module, with the trace on - we hit the instruction which returns EINVAL thereby confirming the problem.

1 comment:

Anonymous said...

The information here is great. I will invite my friends here.

Thanks